TITLE OF THE INVENTION

VACCINES COMPRISING SYNTHETIC GENES 

BACKGROUND OF THE INVENTION 
1. HIV Infection:

Human Immunodeficiency Virus-1 (HIV-1) is the
etiological agent of acquired human immune deficiency syndrome
(AIDS) and related disorders. HIV-1 is an RNA virus of the
Retroviridae family and exhibits the 5'LTR-gag-pol-env-LTR3'
organization of all retroviruses. In addition, HIV-1 comprises a handful
of genes with regulatory or unknown functions, including the tat and
rev genes. The env gene encodes the viral envelope glycoprotein that is
translated as a 160-kilodalton (kDa) precursor (gp160) and then cleaved
by a cellular protease to yield the external 120-kDa envelope
glycoprotein (gp120) and the transmembrane 41-kDa envelope
glycoprotein (gp41). Gp120 and gp41 remain associated and are
displayed on the viral particles and the surface of HIV-infected cells.
Gp120 binds to the CD4 receptor present on the surface of helper
T-lymphocytes, macrophages and other target cells. After gp120 binds to
CD4, gp41 mediates the fusion event responsible for virus entry.

Infection begins when gp120 on the viral particle binds to
the CD4 receptor on the surface of T4 lymphocytes or other target cells.
The bound virus merges with the target cell and reverse transcribes its
RNA genome into double-stranded DNA. The viral DNA
is incorporated into the genetic material in the cell's nucleus, where the
viral DNA directs the production of new viral RNA, viral proteins, and
new virus particles. The new particles bud from the target cell
membrane and infect other cells.

Destruction of T4 lymphocytes, which are critical to
immune defense, is a major cause of the progressive immune
dysfunction that is the hallmark of HIV infection. The loss of target
cells seriously impairs the body's ability to fight most invaders, but it
has a particularly severe impact on the defenses against viruses, fungi,
parasites and certain bacteria, including mycobacteria.

HIV-1 kills the cells it infects by replicating, budding from
them and damaging the cell membrane. HIV-1 may kill target cells
indirectly by means of the viral gp120 that is displayed on an infected
cell's surface. Since the CD4 receptor on T cells has a strong affinity for
gp120, healthy cells expressing CD4 receptor can bind to gp120 and
fuse with infected cells to form a syncytium. A syncytium cannot
survive.

HIV-1 can also elicit normal cellular immune defenses
against infected cells. With or without the help of antibodies, cytotoxic
defensive cells can destroy an infected cell that displays viral proteins on
its surface. Finally, free gp120 may circulate in the blood of individuals
infected with HIV-1. The free protein may bind to the CD4 receptor of
uninfected cells, making them appear to be infected and evoking an
immune response.

Infection with HIV-1 is almost always fatal, and at present
there are no cures for HIV-1 infection. Effective vaccines for
prevention of HIV-1 infection are not yet available. Because of the
danger of reversion or infection, live attenuated virus probably cannot
be used as a vaccine. Most subunit vaccine approaches have not been
successful at preventing HIV infection. Treatments for HIV-1 infection,
while prolonging the lives of some infected persons, have serious side
effects. There is thus a great need for effective treatments and vaccines
to combat this lethal infection.

2. Vaccines

Vaccination is an effective form of disease prevention and
has proven successful against several types of viral infection.
Determining ways to present HIV-1 antigens to the human immune
system in order to evoke protective humoral and cellular immunity is a
difficult task. To date, attempts to generate an effective HIV vaccine
have not been successful. In AIDS patients, free virus is present in low
levels only. Transmission of HIV-1 is enhanced by cell-to-cell
interaction via fusion and syncytia formation. Hence, antibodies
generated against free virus or viral subunits are generally ineffective in
eliminating virus-infected cells.

Vaccines exploit the body's ability to "remember" an
antigen. After first encounters with a given antigen the immune system
generates cells that retain an immunological memory of the antigen for
an individual's lifetime. Subsequent exposure to the antigen stimulates
the immune response and results in elimination or inactivation of the
pathogen.

The immune system deals with pathogens in two ways: by
humoral and by cell-mediated responses. In the humoral response
lymphocytes generate specific antibodies that bind to the antigen, thus
inactivating the pathogen. The cell-mediated response involves
cytotoxic and helper T lymphocytes that specifically attack and destroy
infected cells.

Vaccine development with HIV-1 virus presents problems
because HIV-1 infects some of the same cells the vaccine needs to
activate in the immune system (i.e., T4 lymphocytes). It would be
advantageous to develop a vaccine which inactivates HIV before
impairment of the immune system occurs. A particularly suitable type
of HIV vaccine would generate an anti-HIV immune response which
recognizes HIV variants and which works in HIV-positive individuals
who are at the beginning of their infection.

A major challenge to the development of vaccines against
viruses, particularly those with a high rate of mutation such as the
human immunodeficiency virus, against which elicitation of neutralizing
and protective immune responses is desirable, is the diversity of the
viral envelope proteins among different viral isolates or strains.
Because cytotoxic T-lymphocytes (CTLs) in both mice and humans are
capable of recognizing epitopes derived from conserved internal viral
proteins, and are thought to be important in the immune response
against viruses, efforts have been directed towards the development of
CTL vaccines capable of providing heterologous protection against
different viral strains.

It is known that CD8+ CTLs kill virally infected cells when
their T cell receptors recognize viral peptides associated with MHC class
I molecules. The viral peptides are derived from endogenously
synthesized viral proteins, regardless of the protein's location or
function within the virus. Thus, by recognition of epitopes from
conserved viral proteins, CTLs may provide cross-strain protection.
Peptides capable of associating with MHC class I for CTL recognition
originate from proteins that are present in or pass through the
cytoplasm or endoplasmic reticulum. In general, exogenous proteins,
which enter the endosomal processing pathway (as in the case of
antigens presented by MHC class II molecules), are not effective at
generating CD8+ CTL responses.

Most efforts to generate CTL responses have used
replicating vectors to produce the protein antigen within the cell or they
have focused upon the introduction of peptides into the cytosol. These
approaches have limitations that may reduce their utility as vaccines.
Retroviral vectors have restrictions on the size and structure of
polypeptides that can be expressed as fusion proteins while maintaining
the ability of the recombinant virus to replicate, and the effectiveness of
vectors such as vaccinia for subsequent immunizations may be
compromised by immune responses against the vectors themselves.
Also, viral vectors and modified pathogens have inherent risks that may
hinder their use in humans. Furthermore, the selection of peptide
epitopes to be presented is dependent upon the structure of an
individual's MHC antigens and, therefore, peptide vaccines may have
limited effectiveness due to the diversity of MHC haplotypes in outbred
populations.

WO 97/48370                                          PCT/US97/10517

3. DNA Vaccines

Benvenisty, N., and Reshef, L. [PNAS 83, 9551-9555,
(1986)] showed that CaPO4-precipitated DNA introduced into mice
intraperitoneally (i.p.), intravenously (i.v.) or intramuscularly (i.m.)
could be expressed. The i.m. injection of DNA expression vectors
without CaCl2 treatment in mice resulted in the uptake of DNA by the
muscle cells and expression of the protein encoded by the DNA. The
plasmids were maintained episomally and did not replicate.
Subsequently, persistent expression has been observed after i.m.
injection in skeletal muscle of rats, fish and primates, and cardiac
muscle of rats. The technique of using nucleic acids as therapeutic
agents was reported in WO90/11092 (4 October 1990), in which naked
polynucleotides were used to vaccinate vertebrates.

It is not necessary for the success of the method that
immunization be intramuscular. The introduction of gold
microprojectiles coated with DNA encoding bovine growth hormone
(BGH) into the skin of mice resulted in production of anti-BGH
antibodies in the mice. A jet injector has been used to transfect skin,
muscle, fat, and mammary tissues of living animals. Various methods
for introducing nucleic acids have been reviewed. Intravenous injection of a
DNA:cationic liposome complex in mice was shown by Zhu et al.,
[Science 261:209-211 (9 July 1993)] to result in systemic expression of a
cloned transgene. Ulmer et al., [Science 259:1745-1749, (1993)]
reported on the heterologous protection against influenza virus infection
by intramuscular injection of DNA encoding influenza virus proteins.

The need for specific therapeutic and prophylactic agents
capable of eliciting desired immune responses against pathogens and
tumor antigens is met by the instant invention. Of particular importance
in this therapeutic approach is the ability to induce T-cell immune
responses which can prevent infections or disease caused even by virus
strains which are heterologous to the strain from which the antigen gene
was obtained. This is of particular concern when dealing with HIV as
this virus has been recognized to mutate rapidly and many virulent
isolates have been identified [see, for example, LaRosa et al., Science
249:932-935 (1990), identifying 245 separate HIV isolates]. In
response to this recognized diversity, researchers have attempted to
generate CTLs based on peptide immunization. Thus, Takahashi et al.,
[Science 255:333-336 (1992)] reported on the induction of broadly
cross-reactive cytotoxic T cells recognizing an HIV envelope (gp160)
determinant. However, those workers recognized the difficulty in
achieving a truly cross-reactive CTL response and suggested that there
is a dichotomy between the priming or restimulation of T cells, which is
very stringent, and the elicitation of effector function, including
cytotoxicity, from already stimulated CTLs.
Wang et al. reported on elicitation of immune responses in
mice against HIV by intramuscular inoculation with a cloned, genomic
(unspliced) HIV gene. However, the level of immune responses
achieved in these studies was very low. In addition, the Wang et al.
DNA construct utilized an essentially genomic piece of HIV encoding
contiguous Tat/rev-gp160-Tat/rev coding sequences. As is described in
detail below, this is a suboptimal system for obtaining high-level
expression of the gp160. It also is potentially dangerous because
expression of Tat contributes to the progression of Kaposi's Sarcoma.

WO 93/17706 describes a method for vaccinating an animal
against a virus, wherein carrier particles were coated with a gene
construct and the coated particles are accelerated into cells of an animal.
In regard to HIV, essentially the entire genome, minus the long terminal
repeats, was proposed to be used. That method represents substantial
risks for recipients. It is generally believed that constructs of HIV
should contain less than about 50% of the HIV genome to ensure safety
of the vaccine; this ensures that enzymatic moieties and viral regulatory
proteins, many of which have unknown or poorly understood functions,
have been eliminated. Thus, a number of problems remain if a useful
human HIV vaccine is to emerge from the gene-delivery technology.

The instant invention contemplates any of the known
methods for introducing polynucleotides into living tissue to induce
expression of proteins. However, this invention provides a novel
immunogen for introducing HIV and other proteins into the antigen
processing pathway to efficiently generate HIV-specific CTLs and
antibodies. The pharmaceutical is effective as a vaccine to induce both
cellular and humoral anti-HIV and HIV neutralizing immune responses.
In the instant invention, the problems noted above are addressed and
solved by the provision of polynucleotide immunogens which, when
introduced into an animal, direct the efficient expression of HIV
proteins and epitopes without the attendant risks associated with those
methods. The immune responses thus generated are effective at
recognizing HIV, at inhibiting replication of HIV, at identifying and
killing cells infected with HIV, and are cross-reactive against many HIV
strains.

4. Codon Usage and Codon Context 

The codon pairings of organisms are highly nonrandom,
and differ from organism to organism. This information is used to
construct and express altered or synthetic genes having desired levels of
translational efficiency, to determine which regions in a genome are
protein coding regions, to introduce translational pause sites into
heterologous genes, and to ascertain relationship or ancestral origin of
nucleotide sequences.

The expression of foreign heterologous genes in
transformed organisms is now commonplace. A large number of
mammalian genes, including, for example, murine and human genes,
have been successfully inserted into single celled organisms. Standard
techniques in this regard include introduction of the foreign gene to be
expressed into a vector such as a plasmid or a phage and utilizing that
vector to insert the gene into an organism. The native promoters for
such genes are commonly replaced with strong promoters compatible
with the host into which the gene is inserted. Protein sequencing
machinery permits elucidation of the amino acid sequences of even
minute quantities of native protein. From these amino acid sequences,
DNA sequences coding for those proteins can be inferred. DNA
synthesis is also a rapidly developing art, and synthetic genes
corresponding to those inferred DNA sequences can be readily
constructed.

Despite the burgeoning knowledge of expression systems
and recombinant DNA, significant obstacles remain when one attempts
to express a foreign or synthetic gene in an organism. Many native,
active proteins, for example, are glycosylated in a manner different
from that which occurs when they are expressed in a foreign host. For
this reason, eukaryotic hosts such as yeast may be preferred to bacterial
hosts for expressing many mammalian genes. The glycosylation
problem is the subject of continuing research.

Another problem is more poorly understood. Often
translation of a synthetic gene, even when coupled with a strong
promoter, proceeds much less efficiently than would be expected. The
same is frequently true of exogenous genes foreign to the expression
organism. Even when the gene is transcribed in a sufficiently efficient
manner that recoverable quantities of the translation product are
produced, the protein is often inactive or otherwise different in
properties from the native protein.

It is recognized that the latter problem is commonly due to
differences in protein folding in various organisms. The solution to this
problem has been elusive, and the mechanisms controlling protein
folding are poorly understood.

The problems related to translational efficiency are
believed to be related to codon context effects. The protein coding
regions of genes in all organisms are subject to a wide variety of
functional constraints, some of which depend on the requirement for
encoding a properly functioning protein, as well as appropriate
translational start and stop signals. However, several features of protein
coding regions have been discerned which are not readily understood in
terms of these constraints. Two important classes of such features are
those involving codon usage and codon context.

It is known that codon utilization is highly biased and varies
considerably between different organisms. Codon usage patterns have
been shown to be related to the relative abundance of tRNA
isoacceptors. Genes encoding proteins of high versus low abundance
show differences in their codon preferences. The possibility that biases
in codon usage alter peptide elongation rates has been widely discussed.
While differences in codon use are associated with differences in
translation rates, direct effects of codon choice on translation have been
difficult to demonstrate. Other proposed constraints on codon usage
patterns include maximizing the fidelity of translation and optimizing
the kinetic efficiency of protein synthesis.
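
As a minimal illustration of what a codon usage pattern is, the following
Python sketch tallies the relative frequency of each codon in an in-frame
coding sequence. The function name codon_usage and the short example
sequence are hypothetical and are not taken from this disclosure; a real
analysis would use complete coding regions from the organism of interest.

```python
from collections import Counter


def codon_usage(seq):
    """Return the relative frequency of each codon in an in-frame sequence.

    Trailing bases that do not complete a codon are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical toy sequence: ATG CTG CTG TTA CTG TAA
usage = codon_usage("ATGCTGCTGTTACTGTAA")
for codon, freq in sorted(usage.items()):
    print(codon, round(freq, 3))
```

Comparing such tables for genes of high and low abundance, or for
different organisms, is one simple way to observe the bias described
above.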

Apart from the non-random use of codons, considerable
evidence has accumulated that codon/anticodon recognition is influenced
by sequences outside the codon itself, a phenomenon termed "codon
context." There exists a strong influence of nearby nucleotides on the
efficiency of suppression of nonsense codons as well as missense codons.
Clearly, the abundance of suppressor activity in natural bacterial
populations, as well as the use of "termination" codons to encode
selenocysteine and phosphoserine, require that termination be
context-dependent. Similar context effects have been shown to influence the
fidelity of translation, as well as the efficiency of translation initiation.

Statistical analyses of protein coding regions of E. coli have
demonstrated another manifestation of "codon context." The presence of
a particular codon at one position strongly influences the frequency of
occurrence of certain nucleotides in neighboring codons, and these
context constraints differ markedly for genes expressed at high versus
low levels. Although the context effect has been recognized, the
predictive value of the statistical rules relating to preferred nucleotides
adjacent to codons is relatively low. This has limited the utility of such
nucleotide preference data for selecting codons to effect desired levels
of translational efficiency.
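
One simple way to look for nonrandom codon context is to compare how
often each adjacent codon pair is observed with how often it would be
expected if codons were arranged independently. The Python sketch below
does this for a single in-frame sequence. It is an illustrative
assumption about how such a statistic could be computed, not the
statistical method referred to above; the names split_codons and
codon_pair_bias and the toy sequence are hypothetical.

```python
from collections import Counter


def split_codons(seq):
    """Split an in-frame sequence into codons, dropping incomplete ends."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def codon_pair_bias(seq):
    """Return observed/expected ratios for each adjacent codon pair.

    Expected counts assume the two codons occur independently, using
    their individual frequencies in the same sequence.
    """
    codons = split_codons(seq)
    if len(codons) < 2:
        return {}
    singles = Counter(codons)
    pairs = Counter(zip(codons, codons[1:]))
    n_codons = len(codons)
    n_pairs = n_codons - 1
    bias = {}
    for (first, second), observed in pairs.items():
        expected = (singles[first] / n_codons) * (singles[second] / n_codons) * n_pairs
        bias[(first, second)] = observed / expected
    return bias


# Hypothetical toy sequence: ATG CTG CTG CTG TAA
bias = codon_pair_bias("ATGCTGCTGCTGTAA")
```

A ratio well above or below 1 across a large set of genes would suggest
a preferred or avoided pairing; a single short sequence such as the toy
example carries no statistical weight.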

The advent of automated nucleotide sequencing equipment
has made available large quantities of sequence data for a wide variety
of organisms. Understanding those data presents substantial difficulties.
For example, it is important to identify the coding regions of the
genome in order to relate the genetic sequence data to protein sequences.
In addition, the ancestry of the genome of certain organisms is of
substantial interest. It is known that genomes of some organisms are of
mixed ancestry. Some sequences that are viral in origin are now stably
incorporated into the genome of eukaryotic organisms. The viral
sequences themselves may have originated in another substantially
unrelated species. An understanding of the ancestry of a gene can be
important in drawing proper analogies between related genes and their
translation products in other organisms.

There is a need for a better understanding of codon context
effects on translation, and for a method for determining the appropriate
codons for any desired translational effect. There is also a need for a
method for identifying coding regions of the genome from nucleotide
sequence data. There is also a need for a method for controlling protein
folding and for ensuring that the product of a foreign gene will fold
appropriately when expressed in a host. Genes altered or constructed in
accordance with desired translational efficiencies would be of significant worth.

Another aspect of the practice of recombinant DNA
techniques for the expression by microorganisms of proteins of
industrial and pharmaceutical interest is the phenomenon of "codon
preference". While it was earlier noted that the existing machinery for
gene expression in genetically transformed host cells will "operate" to
construct a given desired product, levels of expression attained in a
microorganism can be subject to wide variation, depending in part on
specific alternative forms of the amino acid-specifying genetic code
present in an inserted exogenous gene. A "triplet" codon of four
possible nucleotide bases can exist in 64 variant forms. That these
forms provide the message for only 20 different amino acids (as well as
translation initiation and termination) means that some amino acids
can be coded for by more than one codon. Indeed, some amino acids
have as many as six "redundant", alternative codons while some others
have a single, required codon. For reasons not completely understood,
alternative codons are not at all uniformly present in the endogenous
DNA of differing types of cells and there appears to exist a variable
natural hierarchy or "preference" for certain codons in certain types of
cells.
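
The arithmetic behind this redundancy can be checked directly. The
Python sketch below builds the standard genetic code from its
conventional compact string form (bases ordered T, C, A, G) and counts
how many codons specify each amino acid. The variable names are
illustrative only; "*" denotes a termination codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, DNA alphabet, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that specify each amino acid (or termination).
degeneracy = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))                      # 4 ** 3 triplets
print(sorted(degeneracy.items(), key=lambda kv: (-kv[1], kv[0])))
```

Leucine, serine and arginine each have six codons, whereas methionine
and tryptophan each have a single codon, consistent with the range
described above.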

As one example, the amino acid leucine is specified by any
of six DNA codons including CTA, CTC, CTG, CTT, TTA, and TTG
(which correspond, respectively, to the mRNA codons, CUA, CUC,
CUG, CUU, UUA and UUG). Exhaustive analysis of genome codon
frequencies for microorganisms has revealed endogenous DNA of E.
coli most commonly contains the CTG leucine-specifying codon, while
the DNA of yeasts and slime molds most commonly includes a TTA
leucine-specifying codon. In view of this hierarchy, it is generally held
that the likelihood of obtaining high levels of expression of a
leucine-rich polypeptide by an E. coli host will depend to some extent on the
frequency of codon use. For example, a gene rich in TTA codons will
in all probability be poorly expressed in E. coli, whereas a CTG-rich
gene will probably express the polypeptide at high levels. Similarly, when yeast
cells are the projected transformation host cells for expression of a
leucine-rich polypeptide, a preferred codon for use in an inserted DNA
would be TTA.
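
The leucine example can be expressed as a small recoding step. In the
Python sketch below, every leucine codon in an in-frame sequence is
replaced by the codon stated above to be most common in the chosen host
(CTG for E. coli, TTA for yeast). The function name recode_leucine and
the toy sequence are hypothetical; the sketch handles leucine only and
is not the gene design procedure of this disclosure.

```python
# The six DNA codons for leucine, as listed in the text.
LEUCINE_CODONS = {"CTA", "CTC", "CTG", "CTT", "TTA", "TTG"}

# Most common leucine codon per host, as stated in the text.
PREFERRED_LEUCINE = {"E. coli": "CTG", "yeast": "TTA"}


def recode_leucine(seq, host):
    """Replace every leucine codon with the host's preferred leucine codon.

    The encoded amino acid sequence is unchanged because all six codons
    specify leucine.
    """
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    preferred = PREFERRED_LEUCINE[host]
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(preferred if c in LEUCINE_CODONS else c for c in codons)


# Hypothetical toy sequence: ATG TTA CTC TAA
for_e_coli = recode_leucine("ATGTTACTCTAA", "E. coli")
for_yeast = recode_leucine("ATGTTACTCTAA", "yeast")
print(for_e_coli, for_yeast)
```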

The implications of codon preference phenomena on
recombinant DNA techniques are manifest, and the phenomenon may
serve to explain many prior failures to achieve high expression levels of
exogenous genes in successfully transformed host organisms: a less
"preferred" codon may be repeatedly present in the inserted gene and
the host cell machinery for expression may not operate as efficiently.
This phenomenon suggests that synthetic genes which have been
designed to include a projected host cell's preferred codons provide a
preferred form of foreign genetic material for practice of recombinant
DNA techniques.
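
A synthetic gene of this kind can be sketched as a back-translation: each
amino acid of the desired protein is written with one codon chosen for
the projected host. In the Python sketch below the table PREFERRED is a
deliberately tiny, hypothetical placeholder. Only the leucine entry (CTG,
for E. coli) comes from the text above; ATG and TGG are the sole codons
for methionine and tryptophan, and the lysine and stop entries are
assumptions chosen for illustration. A real design would use a complete,
measured table for the host.

```python
# Hypothetical, incomplete preferred-codon table (one-letter amino acid
# codes; "*" is a stop). Only "L" -> "CTG" is taken from the text.
PREFERRED = {
    "M": "ATG",  # only codon for methionine
    "W": "TGG",  # only codon for tryptophan
    "L": "CTG",  # stated in the text as most common in E. coli
    "K": "AAA",  # assumption for illustration
    "*": "TAA",  # assumption for illustration
}


def back_translate(protein, preferred):
    """Write a protein sequence as DNA using one preferred codon per residue."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise KeyError(f"no preferred codon for: {', '.join(missing)}")
    return "".join(preferred[aa] for aa in protein)


# Hypothetical toy peptide: Met-Leu-Lys-Trp-stop
gene = back_translate("MLKW*", PREFERRED)
print(gene)
```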

5. Protein Trafficking

The diversity of function that typifies eukaryote cells
depends upon the structural differentiation of their membrane
boundaries. To generate and maintain these structures, proteins must be
transported from their site of synthesis in the endoplasmic reticulum to
predetermined destinations throughout the cell. This requires that the
trafficking proteins display sorting signals that are recognized by the
molecular machinery responsible for route selection located at the access
points to the main trafficking pathways. Sorting decisions for most
proteins need to be made only once as they traverse their biosynthetic
pathways since their final destination, the cellular location at which they
perform their function, becomes their permanent residence.

Maintenance of intracellular integrity depends in part on
the selective sorting and accurate transport of proteins to their correct
destinations. Over the past few years the molecular
machinery for targeting and localization of proteins has been studied
vigorously. Defined sequence motifs have been identified on proteins
which can act as 'address labels'. A number of sorting signals have been
found associated with the cytoplasmic domains of membrane proteins.

SUMMARY OF THE INVENTION

Synthetic polynucleotides comprising a DNA sequence
encoding a peptide or protein are provided. The DNA sequence of the
synthetic polynucleotides comprises codons optimized for expression in a
nonhomologous host. The invention is exemplified by synthetic DNA
molecules encoding HIV env as well as modifications of HIV env. The
codons of the synthetic molecules include the projected host cell's
preferred codons. The synthetic molecules provide preferred forms of
foreign genetic material. The synthetic molecules may be used as a
polynucleotide vaccine which provides effective immunoprophylaxis
against HIV infection through neutralizing antibody and cell-mediated
immunity. This invention provides polynucleotides which, when directly
introduced into a vertebrate in vivo, including mammals such as
primates and humans, induce the expression of encoded proteins within
the animal.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows HIV env cassette-based expression
strategies.

Figure 2 shows DNA vaccine mediated anti-gp120
responses.

Figure 3 shows anti-gp120 ELISA titers of murine DNA
vaccinee sera.

Figure 4 shows the relative expression of gp120 after HIV
env PNV cell culture transfection.

Figure 5 shows the mean anti-gp120 ELISA responses
following tPA-gp143/optA vs. optB DNA vaccination.

Figure 6 shows the neutralization of HIV by murine DNA
vaccinee sera.

Figure 7 shows HIV neutralization by sera from murine
HIV env DNA vaccinees.

Figure 8 is an immunoblot analysis of optimized HIV env
DNA constructs.

Figure 9 shows anti-gp120 ELISA responses in rhesus
monkeys following final vaccination with gp140 DNA and o-gp160
protein.

Figure 10 shows SHIV neutralizing antibody responses of
rhesus monkeys following final vaccination.

DETAILED DESCRIPTION OF THE INVENTION

Synthetic polynucleotides comprising a DNA sequence
encoding a peptide or protein are provided. The DNA sequence of the
synthetic polynucleotides comprises codons optimized for expression in a
nonhomologous host. The invention is exemplified by synthetic DNA
molecules encoding HIV env as well as modifications of HIV env.
The codons of the synthetic molecules include the projected
host cell's preferred codons. The synthetic molecules provide preferred
forms of foreign genetic material. The synthetic molecules may be used
as a polynucleotide vaccine which provides immunoprophylaxis against
HIV infection through neutralizing antibody and cell-mediated
immunity. This invention provides polynucleotides which, when directly
introduced into a vertebrate in vivo, including mammals such as
primates and humans, induce the expression of encoded proteins within
the animal.

Therefore, synthetic DNA molecules encoding HIV env and
synthetic DNA molecules encoding modified forms of HIV env are
provided. The codons of the synthetic molecules are designed so as to
use the codons preferred by the projected host cell. As noted above, the
synthetic molecules of this portion of the invention may be used as a
polynucleotide vaccine which provides effective immunoprophylaxis
against HIV infection through neutralizing antibody and cell-mediated
immunity. The synthetic molecules may be used as an immunogenic
composition. This portion of the invention also provides
polynucleotides which, when directly introduced into a vertebrate in
vivo, including mammals such as primates and humans, induce the
expression of encoded proteins within the animal.

As used herein, a polynucleotide is a nucleic acid which
contains essential regulatory elements such that upon introduction into a
living, vertebrate cell, it is able to direct the cellular machinery to
produce translation products encoded by the genes comprising the
polynucleotide. In one embodiment of the invention, the polynucleotide
is a polydeoxyribonucleic acid comprising at least one HIV gene
operatively linked to a transcriptional promoter. In another
embodiment of the invention, the polynucleotide vaccine (PNV)
comprises polyribonucleic acid encoding at least one HIV gene which is
amenable to translation by the eukaryotic cellular machinery
(ribosomes, tRNAs, and other translation factors). Where the protein
encoded by the polynucleotide is one which does not normally occur in
that animal except in pathological conditions (i.e., a heterologous
protein), such as proteins associated with human immunodeficiency
virus (HIV), the etiologic agent of acquired immune deficiency
syndrome (AIDS), the animal's immune system is activated to launch a
protective immune response. Because these exogenous proteins are
produced by the animal's tissues, the expressed proteins are processed
by the major histocompatibility system, MHC, in a fashion analogous to
when an actual infection with the related organism (HIV) occurs. The
result, as shown in this disclosure, is induction of immune responses
against the cognate pathogen.

Accordingly, the instant inventors have prepared nucleic
acids which, when introduced into the biological system, induce the
expression of HIV proteins and epitopes. The induced antibody
response is both specific for the expressed HIV protein, and neutralizes
HIV. In addition, cytotoxic T-lymphocytes which specifically recognize
and destroy HIV infected cells are induced.

The instant invention provides a method for using a
polynucleotide which, upon introduction into mammalian tissue, induces
the expression in a single cell, in vivo, of discrete gene products. The
instant invention provides a different solution which does not require
multiple manipulations of rev-dependent HIV genes to obtain
rev-independent genes. The rev-independent expression system described
herein is useful in its own right and is a system for demonstrating the
expression in a single cell in vivo of a single desired gene-product.

Because many of the applications of the instant invention
apply to anti-viral vaccination, the polynucleotides are frequently
referred to as a polynucleotide vaccine, or PNV. This is not to say that
additional utilities of these polynucleotides, in immune stimulation and
in anti-tumor therapeutics, are considered to be outside the scope of the
invention.

In one embodiment of this invention, a gene encoding an
HIV gene product is incorporated in an expression vector. The vector
contains a transcriptional promoter recognized by a eukaryotic RNA
polymerase, and a transcriptional terminator at the end of the HIV gene
coding sequence. In a preferred embodiment, the promoter is the
cytomegalovirus promoter with the intron A sequence (CMV-intA),
although those skilled in the art will recognize that any of a number of
other known promoters such as the strong immunoglobulin, or other
eukaryotic gene promoters may be used. A preferred transcriptional
terminator is the bovine growth hormone terminator. The combination
of CMV-intA and the BGH terminator is particularly preferred.

To assist in preparation of the polynucleotides in
prokaryotic cells, an antibiotic resistance marker is also preferably
included in the expression vector under transcriptional control of a
prokaryotic promoter so that expression of the antibiotic resistance
marker does not occur in eukaryotic cells. Ampicillin resistance genes,
neomycin resistance genes and other pharmaceutically acceptable
antibiotic resistance markers may be used. To aid in the high level
production of the polynucleotide by fermentation in prokaryotic
organisms, it is advantageous for the vector to contain a prokaryotic
origin of replication and be of high copy number. A number of
commercially available prokaryotic cloning vectors provide these
benefits. It is desirable to remove non-essential DNA sequences. It is
also desirable that the vectors not be able to replicate in eukaryotic
cells. This minimizes the risk of integration of polynucleotide vaccine
sequences into the recipients' genome. Tissue-specific promoters or
enhancers may be used whenever it is desirable to limit expression of the
polynucleotide to a particular tissue type.

In one embodiment, the expression vector pnRSV is used,
wherein the Rous Sarcoma Virus (RSV) long terminal repeat (LTR) is
used as the promoter. In another embodiment, V1, a mutated pBR322
vector into which the CMV promoter and the BGH transcriptional
terminator were cloned, is used. In another embodiment, the elements of
V1 and pUC19 have been combined to produce an expression vector
named V1J. Into V1J or another desirable expression vector is cloned
an HIV gene, such as gp120, gp41, gp160, gag, pol, env, or any other
HIV gene which can induce anti-HIV immune responses. In another
embodiment, the ampicillin resistance gene is removed from V1J and
replaced with a neomycin resistance gene, to generate V1J-neo, into
which different HIV genes have been cloned for use according to this
invention. In another embodiment, the vector is V1Jns, which is the
same as V1Jneo except that a unique SfiI restriction site has been
engineered into the single KpnI site at position 2114 of V1J-neo. The
incidence of SfiI sites in human genomic DNA is very low
(approximately 1 site per 100,000 bases). Thus, this vector allows
careful monitoring for expression vector integration into host DNA,
simply by SfiI digestion of extracted genomic DNA. In a further
refinement, the vector is V1R. In this vector, as much non-essential
DNA as possible was "trimmed" from the vector to produce a highly
compact vector. This vector is a derivative of V1Jns. This vector
allows larger inserts to be used, with less concern that undesirable
sequences are encoded, and optimizes uptake by cells.

One embodiment of this invention incorporates genes
encoding HIV gp160, gp120, gag and other gene products from
laboratory adapted strains of HIV such as SF2, IIIB or MN. Those
skilled in the art will recognize that the use of genes from HIV-2 strains
having analogous function to the genes from HIV-1 would be expected
to generate immune responses analogous to those described herein for
HIV-1 constructs. The cloning and manipulation methods for obtaining
these genes are known to those skilled in the art.
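
The SfiI digestion check described above relies on how rarely the SfiI
recognition sequence occurs in host DNA. As a rough illustration only, the
following Python sketch locates SfiI recognition sequences (GGCCNNNNNGGCC,
where N is any base) in a DNA string and computes the number of sites expected
at the frequency quoted above (approximately 1 site per 100,000 bases). The
function names and the toy sequence are hypothetical and are not part of the
disclosed vectors or methods.

```python
import re

# SfiI recognizes GGCCNNNNNGGCC (N = any base). The lookahead allows
# overlapping occurrences to be reported as well.
SFII_SITE = re.compile(r"(?=GGCC[ACGT]{5}GGCC)")


def find_sfii_sites(sequence: str) -> list[int]:
    """Return 0-based start positions of SfiI recognition sequences."""
    return [m.start() for m in SFII_SITE.finditer(sequence.upper())]


def expected_sites(length_bp: int, sites_per_base: float = 1 / 100_000) -> float:
    """Expected site count, assuming the approximate frequency quoted in the text."""
    return length_bp * sites_per_base


# Hypothetical toy sequence carrying one engineered SfiI site.
toy_vector = "ATGC" * 5 + "GGCCATGCAGGCC" + "TTAA" * 5
print(find_sfii_sites(toy_vector))
print(expected_sites(3_000_000))
```

A sequence with no engineered site returns an empty list, which is the in
silico analogue of seeing no additional SfiI fragment on digestion.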

It is recognized that elicitation of immune responses against
laboratory adapted strains of HIV may not be adequate to provide
neutralization of primary field isolates of HIV. Thus, in another
embodiment of this invention, genes from virulent, primary field
isolates of HIV are incorporated in the polynucleotide immunogen. This
is accomplished by preparing cDNA copies of the viral genes and then
subcloning the individual genes into the polynucleotide immunogen.
Sequences for many genes of many HIV strains are now publicly
available on GENBANK and such primary, field isolates of HIV are
available from the National Institute of Allergy and Infectious Diseases
(NIAID) which has contracted with Quality Biological, Inc., [7581
Lindbergh Drive, Gaithersburg, Maryland 20879] to make these strains
available. Such strains are also available from the World Health
Organization (WHO) [Network for HIV Isolation and Characterization,
Vaccine Development Unit, Office of Research, Global Programme on
AIDS, CH-1211 Geneva 27, Switzerland]. From this work those skilled
in the art will recognize that one of the utilities of the instant invention
is to provide a system for in vivo as well as in vitro testing and analysis
so that a correlation of HIV sequence diversity with serology of HIV
neutralization, as well as other parameters, can be made. Incorporation
of genes from primary isolates of HIV strains provides an immunogen
which induces immune responses against clinical isolates of the virus and
thus meets a need as yet unmet in the field. Furthermore, as the virulent
isolates change, the immunogen may be modified to reflect new
sequences as necessary.






WO 97/48370



PCT/US97/10517 



To keep the terminology consistent, the following
convention is followed herein for describing polynucleotide immunogen
constructs: "Vector name-HIV strain-gene-additional elements". Thus,
a construct wherein the gp160 gene of the MN strain is cloned into the
expression vector V1Jneo is named herein: "V1Jneo-MN-gp160".
The additional elements that are added to the construct are
described in further detail below. As the etiologic strain of the virus
changes, the precise gene which is optimal for incorporation in the
pharmaceutical may be changed. However, as is demonstrated below,
because CTL responses are induced which are capable of protecting
against heterologous strains, the strain variability is less critical in the
immunogen and vaccines of this invention, as compared with the whole
virus or subunit polypeptide based vaccines. In addition, because the
pharmaceutical is easily manipulated to insert a new gene, this is an
adjustment which is easily made by the standard techniques of molecular
biology.
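
The naming convention above can be illustrated with a short Python sketch.
It is offered only as an aid to reading construct names; the class name, field
names and the "RRE" additional element used in the second example are
hypothetical choices made for illustration and do not name any construct of
this disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Construct:
    """One polynucleotide immunogen construct, named per the stated convention."""
    vector: str
    strain: str
    gene: str
    additional: list[str] = field(default_factory=list)

    def name(self) -> str:
        # "Vector name-HIV strain-gene-additional elements"
        return "-".join([self.vector, self.strain, self.gene, *self.additional])


# The example given in the text.
print(Construct("V1Jneo", "MN", "gp160").name())

# Hypothetical construct with one additional element appended.
print(Construct("V1Jns", "IIIB", "gp160", ["RRE"]).name())
```

Keeping the four parts as separate fields makes it straightforward to swap the
strain or gene as the etiologic strain changes, while the name is regenerated
consistently.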

The term "promoter" as used herein refers to a recognition
site on a DNA strand to which the RNA polymerase binds. The
promoter forms an initiation complex with RNA polymerase to initiate
and drive transcriptional activity. The complex can be modified by
activating sequences termed "enhancers" or inhibiting sequences termed
"silencers."

The term "leader" as used herein refers to a DNA sequence
at the 5' end of a structural gene which is transcribed along with the
gene. The leader usually results in the protein having an N-terminal
peptide extension sometimes called a pro-sequence. For proteins
destined for either secretion to the extracellular medium or a
membrane, this signal sequence, which is generally hydrophobic, directs
the protein into the endoplasmic reticulum from which it is discharged to
the appropriate destination.

The term "intron" as used herein refers to a section of
DNA occurring in the middle of a gene which does not code for an
amino acid in the gene product. The precursor RNA of the intron is
excised and is therefore not transcribed into mRNA nor translated into
protein.

The term "cassette" refers to the sequence of the present
invention which contains the nucleic acid sequence which is to be
expressed. The cassette is similar in concept to a cassette tape. Each
cassette will have its own sequence. Thus by interchanging the cassette
the vector will express a different sequence. Because of the restriction
sites at the 5' and 3' ends, the cassette can be easily inserted, removed or
replaced with another cassette.

The term "3' untranslated region" or "3' UTR" refers to
the sequence at the 3' end of a structural gene which is usually
transcribed with the gene. This 3' UTR region usually contains the poly
A sequence. Although the 3' UTR is transcribed from the DNA it is
excised before translation into the protein.

The term "Non-Coding Region" or "NCR" refers to the
region which is contiguous to the 3' UTR region of the structural gene.
The NCR region contains a transcriptional termination signal.

The term "restriction site" refers to a sequence specific 
cleavage site of restriction endonucleases. 

The term "vector" refers to some means by which DNA
fragments can be introduced into a host organism or host tissue. There
are various types of vectors including plasmids, bacteriophages and
cosmids.

The term "effective amount" means that sufficient PNV is
injected to produce adequate levels of the polypeptide. One skilled in
the art recognizes that this level may vary.

To provide a description of the instant invention, the
following background on HIV is provided. The human
immunodeficiency virus has a ribonucleic acid (RNA) genome. This
RNA genome must be reverse transcribed according to methods known
in the art in order to produce a cDNA copy for cloning and
manipulation according to the methods taught herein. At each end of
the genome is a long terminal repeat which acts as a promoter. Between
these termini, the genome encodes, in various reading frames,
gag-pol-env as the major gene products: gag is the group specific
antigen; pol is the reverse transcriptase, or polymerase; also encoded by
this region, in an alternate reading frame, is the viral protease which is
responsible for post-translational processing, for example, of gp160
into gp120 and gp41; env is the envelope protein; vif is the virion
infectivity factor; rev is the regulator of virion protein expression; nef
is the negative regulatory factor; vpu is the virion productivity factor
"u"; tat is the trans-activator of transcription; vpr is the viral protein r.
The function of each of these elements has been described.

In one embodiment of this invention, a gene encoding an
HIV or SIV protein is directly linked to a transcriptional promoter.
The env gene encodes a large, membrane bound protein, gp160, which
is post-translationally modified to gp41 and gp120. The gp120 gene
may be placed under the control of the cytomegalovirus promoter for
expression. However, gp120 is not membrane bound and therefore,
upon expression, it may be secreted from the cell. As HIV tends to
remain dormant in infected cells, it is desirable that immune responses
directed at cell-bound HIV epitopes also be generated. Additionally, it
is desirable that a vaccine produce membrane bound, oligomeric ENV
antigen similar in structure to that produced by viral infection in order
to generate the most efficacious antibody responses for viral
neutralization. This goal is accomplished herein by expression in vivo
of a secreted gp140 epitope (gp140 = gp120 + ectodomain of gp41) or
the cell-membrane associated epitope, gp160, to prime the immune
system. However, expression of gp160 is repressed in the absence of
rev due to non-export from the nucleus of non-spliced genes. For an
understanding of this system, the life cycle of HIV must be described in
further detail.

In the life cycle of HIV, upon infection of a host cell, the HIV
RNA genome is reverse-transcribed into a proviral DNA which
integrates into host genomic DNA as a single transcriptional unit. The
LTR provides the promoter which transcribes HIV genes from the 5' to
3' direction (gag, pol, env), to form an unspliced transcript of the entire
genome. The unspliced transcript functions as the mRNA from which
gag and pol are translated, while limited splicing must occur for
translation of env encoded genes. For the regulatory gene product rev
to be expressed, more than one splicing event must occur because in the
genomic setting, rev and env overlap. In order for transcription of env
to occur, rev transcription must stop, and vice versa. In addition, the
presence of rev is required for export of unspliced RNA from the
nucleus. For rev to function in this manner, however, a rev responsive
element (RRE) must be present on the transcript [Malim et al., Nature
338:254-257 (1989)].

In the polynucleotide vaccine of this invention, the
obligatory splicing of certain HIV genes is eliminated by providing fully
spliced genes (i.e.: the provision of a complete open reading frame for
the desired gene product without the need for switches in the reading
frame or elimination of noncoding regions; those of ordinary skill in the
art would recognize that when splicing a particular gene, there is some
latitude in the precise sequence that results; however so long as a
functional coding sequence is obtained, this is acceptable). Thus, in one
embodiment, the entire coding sequence for gp160 is spliced such that
no intermittent expression of each gene product is required.
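
The notion of a "complete open reading frame" used above can be made concrete
with a simplified check. The Python sketch below is an illustration under
stated assumptions, not a method of this disclosure: it treats a coding
sequence as complete when it begins with ATG, has a length divisible by three,
ends in a standard stop codon (TAA, TAG or TGA) and contains no earlier
in-frame stop. The function name and the toy sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_complete_orf(cds: str) -> bool:
    """Return True if cds is a single uninterrupted open reading frame."""
    seq = cds.upper()
    # Need at least a start and a stop codon, in whole codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in codons[1:-1])
    )


print(is_complete_orf("ATGGCTTAA"))        # start, one codon, stop
print(is_complete_orf("ATGGCTTAAGCTTGA"))  # premature in-frame stop
print(is_complete_orf("ATGGCTT"))          # length not a multiple of three
```

A check of this kind does not establish that the encoded protein is
functional; it only confirms that no reading-frame switch or intervening
noncoding region remains in the sequence.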

The dual humoral and cellular immune responses generated
according to this invention are particularly significant to inhibiting HIV
infection, given the propensity of HIV to mutate within the population,
as well as in infected individuals. In order to formulate an effective
protective vaccine for HIV it is desirable to generate both a multivalent
antibody response, for example to gp160 (env is approximately 80%
conserved across various HIV-1, clade B strains, which are the prevalent
strains in US human populations), the principal neutralization target on
HIV, as well as cytotoxic T cells reactive to the conserved portions of
gp160 and internal viral proteins encoded by gag. We have made an
HIV vaccine comprising gp160 genes selected from common laboratory
strains; from predominant, primary viral isolates found within the
infected population; from mutated gp160s designed to unmask
cross-strain, neutralizing antibody epitopes; and from other
representative HIV genes such as the gag and pol genes (~95% conserved
across HIV isolates).

Virtually all HIV seropositive patients who have not
advanced towards an immunodeficient state harbor anti-gag CTLs while
about 60% of these patients show cross-strain, gp160-specific CTLs.
The amount of HIV specific CTLs found in infected individuals that
have progressed on to the disease state known as AIDS, however, is
much lower, demonstrating the significance of our findings that we can
induce cross-strain CTL responses.

Immune responses induced by our env and gag
polynucleotide vaccine constructs are demonstrated in mice and
primates. Monitoring antibody production to env in mice allows
confirmation that a given construct is suitably immunogenic, i.e., a high
proportion of vaccinated animals show an antibody response. Mice also
provide the most facile animal model suitable for testing CTL induction
by our constructs and are therefore used to evaluate whether a
particular construct is able to generate such activity. Monkeys (African
green, rhesus, chimpanzees) provide additional species including
primates for antibody evaluation in larger, non-rodent animals. These
species are also preferred to mice for antisera neutralization assays due
to high levels of endogenous neutralizing activities against retroviruses
observed in mouse sera. These data demonstrate that sufficient
immunogenicity is engendered by our vaccines to achieve protection in
experiments in a chimpanzee/HIVIIIB challenge model based upon
known protective levels of neutralizing antibodies for this system.
However, the currently emerging and increasingly accepted definition of
protection in the scientific community is moving away from so-called
"sterilizing immunity", which indicates complete protection from HIV
infection, to prevention of disease. A number of correlates of this goal
include reduced blood viral titer, as measured either by HIV reverse
transcriptase activity, by infectivity of samples of serum, by ELISA
assay of p24 or other HIV antigen concentration in blood, increased
CD4+ T-cell concentration, and by extended survival rates [see, for
example, Cohen, J., Science 262:1820-1821, 1993, for a discussion of
the evolving definition of anti-HIV vaccine efficacy]. The immunogens
of the instant invention also generate neutralizing immune responses
against infectious (clinical, primary field) isolates of HIV.

Immunology

A. Antibody Responses to env. 

1. gp160 and gp120. An ELISA assay is used to
determine whether vaccine vectors expressing either secreted gp120 or
membrane-bound gp160 are efficacious for production of env-specific
antibodies. Initial in vitro characterization of env expression by our
vaccination vectors is provided by immunoblot analysis of gp160
transfected cell lysates. These data confirm and quantitate gp160
expression using anti-gp41 and anti-gp120 monoclonal antibodies to
visualize transfectant cell gp160 expression. In one embodiment of this
invention, gp160 is preferred to gp120 for the following reasons: (1)
an initial gp120 vector gave inconsistent immunogenicity in mice and
was very poorly or non-responsive in African green monkeys; (2)
gp160 contributes additional neutralizing antibody as well as CTL
epitopes by providing the addition of approximately 190 amino acid
residues due to the inclusion of gp41; (3) gp160 expression is more
similar to viral env with respect to tetramer assembly and overall
conformation, which may provide oligomer-dependent neutralization
epitopes; and (4) we find that, like the success of membrane-bound,
influenza HA constructs for producing neutralizing antibody responses
in mice, ferrets, and nonhuman primates [see Ulmer et al., Science
259:1745-1749, 1993; Montgomery, D., et al., DNA and Cell Biol.
12:777-783, 1993], anti-gp160 antibody generation is superior to
anti-gp120 antibody generation. Selection of which type of env, or
whether a cocktail of env subfragments, is preferred is determined by the
experiments outlined below.

2. Presence and Breadth of Neutralizing Activity.
ELISA positive antisera from monkeys are tested and shown to neutralize
both homologous and heterologous HIV strains.

3. V3 vs. non-V3 Neutralizing Antibodies. A major
goal for env PNVs is to generate broadly neutralizing antibodies. It has
now been shown that antibodies directed against V3 loops are very
strain specific, and the serology of this response has been used to define
strains.

a. Non-V3 neutralizing antibodies appear to
primarily recognize discontinuous, structural epitopes within gp120
which are responsible for CD4 binding. Antibodies to this domain are
polyclonal and more broadly cross-neutralizing, probably due to
restraints on mutations imposed by the need for the virus to bind its
cellular ligand. An in vitro assay is used to test for blocking gp120
binding to CD4 immobilized on 96 well plates by sera from immunized
animals. A second in vitro assay detects direct antibody binding to
synthetic peptides representing selected V3 domains immobilized on
plastic. These assays are compatible for antisera from any of the animal
types used in our studies and define the types of neutralizing antibodies
our vaccines have generated as well as provide an in vitro correlate to
virus neutralization.

b. gp41 harbors at least one major neutralization
determinant, corresponding to the highly conserved linear epitope
recognized by the broadly neutralizing 2F5 monoclonal antibody
(commercially available from Viral Testing Systems Corp., Texas
Commerce Tower, 600 Travis Street, Suite 4750, Houston, TX
77002-3005 (USA), or Waldheim Pharmazeutika GmbH, Boltzmangasse 11,
A-1091 Wien, Austria), as well as other potential sites including the
well-conserved "fusion peptide" domain located at the N-terminus of gp41.
Besides the detection of antibodies directed against gp41 by immunoblot
as described above, an in vitro assay is used to test for antibodies which
bind to synthetic peptides representing these domains immobilized on
plastic.

4. Maturation of the Antibody Response. In HIV
seropositive patients, the neutralizing antibody responses progress from
chiefly anti-V3 to include more broadly neutralizing antibodies
comprising the structural gp120 domain epitopes described above (#3),
including gp41 epitopes. These types of antibody responses are
monitored over the course of both time and subsequent vaccinations.

B. T Cell Reactivities Against env and gag.

1. Generation of CTL Responses. Viral proteins which
are synthesized within cells give rise to MHC I-restricted CTL
responses. Each of these proteins elicits CTL in seropositive patients.
Our vaccines also are able to elicit CTL in mice. The immunogenetics
of mouse strains are conducive to such studies, as demonstrated with
influenza NP [see Ulmer et al., Science 259:1745-1749, 1993]. Several
epitopes have been defined for the HIV proteins env, rev, nef and gag
in Balb/c mice, thus facilitating in vitro CTL culture and cytotoxicity
assays. It is advantageous to use syngeneic tumor lines, such as the
murine mastocytoma P815, transfected with these genes to provide
targets for CTL as well as for in vitro antigen specific restimulation.
Methods for defining immunogens capable of eliciting MHC class
I-restricted cytotoxic T lymphocytes are known [see Calin-Laurens, et al.,
Vaccine 11(9):974-978, 1993; see particularly Eriksson, et al., Vaccine
11(8):859-865, 1993, wherein T-cell activating epitopes on the HIV
gp120 were mapped in primates and several regions, including gp120
amino acids 142-192, 296-343, 367-400, and 410-453 were each found
to induce lymphoproliferation; furthermore, discrete regions 248-269
and 270-295 were lymphoproliferative. A peptide encompassing amino
acids 152-176 was also found to induce HIV neutralizing antibodies],
and these methods may be used to identify immunogenic epitopes for
inclusion in the PNV of this invention. Alternatively, the entire gene
encoding gp160, gp120, protease, or gag could be used. For additional
review on this subject, see for example, Shirai et al., J. Immunol.
148:1657-1667, 1992; Choppin et al., J. Immunol. 147:569-574, 1991;
Choppin et al., J. Immunol. 147:575-583, 1991; Berzofsky et al., J.
Clin. Invest. 88:876-884, 1991. As used herein, T-cell effector function
is associated with mature T-cell phenotype, for example, cytotoxicity,
cytokine secretion for B-cell activation, and/or recruitment or
stimulation of macrophages and neutrophils.

2. Measurement of TH Activities. Spleen cell cultures
derived from vaccinated animals are tested for recall to specific antigens
by addition of either recombinant protein or peptide epitopes.
Activation of T cells by such antigens, presented by accompanying
splenic antigen presenting cells, APCs, is monitored by proliferation of
these cultures or by cytokine production. The pattern of cytokine
production also allows classification of Th response as type 1 or type 2.
Because dominant Th2 responses appear to correlate with the exclusion
of cellular immunity in immunocompromised seropositive patients, it is
possible to define the type of response engendered by a given PNV in
patients, permitting manipulation of the resulting immune responses.

3. Delayed Type Hypersensitivity (DTH). DTH to viral
antigen after i.d. injection is indicative of cellular, primarily MHC
II-restricted, immunity. Because of the commercial availability of
recombinant HIV proteins and synthetic peptides for known epitopes,
DTH responses are easily determined in vaccinated vertebrates using
these reagents, thus providing an additional in vivo correlate for
inducing cellular immunity.

Protection

Based upon the above immunologic studies, it is predictable
that our vaccines are effective in vertebrates against challenge by
virulent HIV. These studies are accomplished in an
HIVIIIB/chimpanzee challenge model after sufficient vaccination of
these animals with a PNV construct, or a cocktail of PNV constructs
comprised of gp160IIIB, gagIIIB, nefIIIB and REVIIIB. The IIIB
strain is useful in this regard as the chimpanzee titer of lethal doses of
this strain has been established. However, the same studies are
envisioned using any strain of HIV and the epitopes specific to or
heterologous to the given strain. A second vaccination/challenge model,
in addition to chimpanzees, is the scid-hu PBL mouse. This model
allows testing of the human lymphocyte immune system and our vaccine
with subsequent HIV challenge in a mouse host. This system is
advantageous as it is easily adapted to use with any HIV strain and it
provides evidence of protection against multiple strains of primary field
isolates of HIV. A third challenge model utilizes hybrid HIV/SIV
viruses (SHIV), some of which have been shown to infect rhesus
monkeys and lead to immunodeficiency disease resulting in death [see
Li, J., et al., J. AIDS 5:639-646, 1992]. Vaccination of rhesus with our
polynucleotide vaccine constructs is protective against subsequent
challenge with lethal doses of SHIV.

PNV Construct Summary 

HIV and other genes are ligated into an expression vector
which has been optimized for polynucleotide vaccinations. Essentially
all extraneous DNA is removed, leaving the essential elements of
transcriptional promoter, immunogenic epitopes, transcriptional
terminator, bacterial origin of replication and antibiotic resistance gene.

Expression of HIV late genes such as env and gag is
rev-dependent and requires that the rev response element (RRE) be present
on the viral gene transcript. A secreted form of gp120 can be generated
in the absence of rev by substitution of the gp120 leader peptide with a
heterologous leader such as from tPA (tissue-type plasminogen
activator), and preferably by a leader peptide such as is found in highly
expressed mammalian proteins such as immunoglobulin leader peptides.
We have inserted a tPA-gp120 chimeric gene into V1Jns which
efficiently expresses secreted gp120 in transfected cells (RD, a human
rhabdomyosarcoma line). Monocistronic gp160 does not produce any
protein upon transfection without the addition of a rev expression
vector.

Representative Construct Components Include (but are not restricted to):

1. tPA-gp120MN;

2. gp160IIIB;

3. gagIIIB: for anti-gag CTL;

4. tPA-gp120IIIB;

5. tPA-gp140;

6. tPA-gp160 with structural mutations: V1, V2, and/or
V3 loop deletions or substitutions;

7. Genes encoding antigens expressed by pathogens other
than HIV, such as, but not limited to, influenza virus
nucleoprotein, hemagglutinin, matrix, neuraminidase, and
other antigenic proteins; herpes simplex virus genes; human
papillomavirus genes; tuberculosis antigens; hepatitis A, B,
or C virus antigens.

The protective efficacy of polynucleotide HIV immunogens
against subsequent viral challenge is demonstrated by immunization with
the non-replicating plasmid DNA of this invention. This is
advantageous since no infectious agent is involved, assembly of virus
particles is not required, and determinant selection is permitted.
Furthermore, because the sequence of gag and protease and several of
the other viral gene products is conserved among various strains of
HIV, protection against subsequent challenge by a virulent strain of HIV
that is homologous to, as well as strains heterologous to, the strain from
which the cloned gene is obtained, is enabled.

The i.m. injection of a DNA expression vector encoding
gp160 results in the generation of significant protective immunity
against subsequent viral challenge. In particular, gp160-specific
antibodies and primary CTLs are produced. Immune responses directed
against conserved proteins can be effective despite the antigenic shift and
drift of the variable envelope proteins. Because each of the HIV gene
products exhibits some degree of conservation, and because CTL are
generated in response to intracellular expression and MHC processing, it
is predictable that many virus genes give rise to responses analogous to
that achieved for gp160. Thus, many of these genes have been cloned,
as shown by the cloned and sequenced junctions in the expression vector
(see below) such that these constructs are immunogenic agents in
available form.

The invention offers a means to induce cross-strain
protective immunity without the need for self-replicating agents or
adjuvants. In addition, immunization with the instant polynucleotides
offers a number of other advantages. This approach to vaccination
should be applicable to tumors as well as infectious agents, since the
CD8+ CTL response is important for both pathophysiological processes
[K. Tanaka et al., Annu. Rev. Immunol. 6, 359 (1988)]. Therefore,
eliciting an immune response against a protein crucial to the
transformation process may be an effective means of cancer protection
or immunotherapy. The generation of high titer antibodies against
expressed proteins after injection of viral protein and human growth
hormone DNA suggests that this is a facile and highly effective means of
making antibody-based vaccines, either separately or in combination
with cytotoxic T-lymphocyte vaccines targeted towards conserved
antigens.

The ease of producing and purifying DNA constructs
compares favorably with traditional methods of protein purification,
thus facilitating the generation of combination vaccines. Accordingly,
multiple constructs, for example encoding gp160, gp120, gp41, or any
other HIV gene may be prepared, mixed and co-administered. Because
protein expression is maintained following DNA injection, the
persistence of B- and T-cell memory may be enhanced, thereby
engendering long-lived humoral and cell-mediated immunity.

Standard techniques of molecular biology for preparing and
purifying DNA constructs enable the preparation of the DNA
immunogens of this invention. While standard techniques of molecular
biology are therefore sufficient for the production of the products of
this invention, the specific constructs disclosed herein provide novel
polynucleotide immunogens which surprisingly produce cross-strain and
primary HIV isolate neutralization, a result heretofore unattainable with
standard inactivated whole virus or subunit protein vaccines.

The amount of expressible DNA or transcribed RNA to be
introduced into a vaccine recipient will depend on the strength of the
transcriptional and translational promoters used and on the
immunogenicity of the expressed gene product. In general, an
immunologically or prophylactically effective dose of about 1 ng to 100
mg, and preferably about 10 µg to 300 µg, is administered directly into
muscle tissue. Subcutaneous injection, intradermal introduction,
impression through the skin, and other modes of administration such as
intraperitoneal, intravenous, or inhalation delivery are also
contemplated. It is also contemplated that booster vaccinations are to be
provided. Following vaccination with HIV polynucleotide immunogen,
boosting with HIV protein immunogens such as gp160, gp120, and gag
gene products is also contemplated. Parenteral administration, such as
intravenous, intramuscular, subcutaneous or other means of
administration of interleukin-12 protein or GM-CSF or similar proteins
alone or in combination, concurrently with or subsequent to parenteral
introduction of the PNV of this invention is also advantageous.

The polynucleotide may be naked, that is, unassociated with
any proteins, adjuvants or other agents which impact on the recipient's
immune system. In this case, it is desirable for the polynucleotide to be
in a physiologically acceptable solution, such as, but not limited to,
sterile saline or sterile buffered saline. Alternatively, the DNA may be
associated with liposomes, such as lecithin liposomes or other liposomes
known in the art, as a DNA-liposome mixture, or the DNA may be
associated with an adjuvant known in the art to boost immune responses,
such as a protein or other carrier. Agents which assist in the cellular
uptake of DNA, such as, but not limited to, calcium ions, may also be
used to advantage. These agents are generally referred to herein as
transfection facilitating reagents and pharmaceutically acceptable
carriers. Techniques for coating microprojectiles with
polynucleotide are known in the art and are also useful in connection
with this invention.

The following examples are offered by way of illustration
and are not intended to limit the invention in any manner.

EXAMPLE 1

Materials descriptions 

Vectors pF411 and pF412: These vectors were subcloned from
vector pSP62, which was constructed in R. Gallo's lab. pSP62 is an
available reagent from Biotech Research Laboratories, Inc. pSP62 has a
12.5 kb XbaI fragment of the HXB2 genome subcloned from lambda
HXB2. SalI and XbaI digestion of pSP62 yields two HXB2 fragments:
5'-XbaI/SalI, 6.5 kb and 3'-SalI/XbaI, 6 kb. These inserts were
subcloned into pUC18 at SmaI and SalI sites yielding pF411 (5'-
XbaI/SalI) and pF412 (3'-XbaI/SalI). pF411 contains gag/pol and pF412
contains tat/rev/env/nef.

Repligen reagents:

recombinant rev (IIIB), #RP1024-10
rec. gp120 (IIIB), #RP1001-10
anti-rev monoclonal antibody, #RP1029-10
anti-gp120 mAB, #1C1, #RP1010-10

AIDS Research and Reference Reagent Program:

anti-gp41 mAB hybridoma, Chessie 8, #526

The strategies are designed to induce both cytotoxic T
lymphocyte (CTL) and neutralizing antibody responses to HIV,
principally directed at the HIV gag (~95% conserved) and env (gp160
or gp120; 70-80% conserved) gene products. gp160 contains the only
known neutralizing antibody epitopes on the HIV particle, while the
importance of anti-env and anti-gag CTL responses is highlighted by
the known association of the onset of these cellular immunities with
clearance of primary viremia following infection, which occurs prior to
the appearance of neutralizing antibodies, as well as a role for CTL in
maintaining disease-free status. Because HIV is notorious for its genetic
diversity, we hope to obtain greater breadth of neutralizing antibodies
by including several representative env genes derived from clinical
isolates and gp41 (~90% conserved), while the highly conserved gag
gene should generate broad cross-strain CTL responses.

EXAMPLE 2
Heterologous Expression of HIV Late Gene Products

HIV structural genes such as env and gag require
expression of the HIV regulatory gene, rev, in order to efficiently
produce full-length proteins. We have found that rev-dependent
expression of gag yielded low levels of protein and that rev itself may
be toxic to cells. Although we achieved relatively high levels of rev-
dependent expression of gp160 in vitro, this vaccine elicited low levels of
antibodies to gp160 following in vivo immunization with rev/gp160
DNA. This may result from known cytotoxic effects of rev as well as
increased difficulty in obtaining rev function in myotubules containing
hundreds of nuclei (rev protein needs to be in the same nucleus as a rev-
dependent transcript for gag or env protein expression to occur).
However, it has been possible to obtain rev-independent expression
using selected modifications of the env gene. Evaluation of these
plasmids for vaccine purposes is underway.

In general, our vaccines have utilized primarily HIV (IIIB)
env and gag genes for optimization of expression within our generalized
vaccination vector, V1Jns, which is comprised of a CMV immediate-
early (IE) promoter, BGH polyadenylation site, and a pUC backbone.
Varying efficiencies, depending upon how large a gene segment is used
(e.g., gp120 vs. gp160), of rev-independent expression may be achieved
for env by replacing its native secretory leader peptide with that from
the tissue-specific plasminogen activator (tPA) gene and expressing the
resulting chimeric gene behind the CMVIE promoter with the CMV
intron A. tPA-gp120 is an example of a secreted gp120 vector
constructed in this fashion which functions well enough to elicit anti-
gp120 immune responses in vaccinated mice and monkeys.

Because of reports that membrane-anchored proteins may
induce much more substantial (and perhaps more specific for HIV
neutralization) antibody responses compared to secreted proteins, as well
as to gain additional immune epitopes, we prepared V1Jns-tPA-gp160
and V1Jns-rev/gp160. The tPA-gp160 vector produced detectable
quantities of gp160 and gp120, without the addition of rev, as shown by
immunoblot analysis of transfected cells, although levels of expression
were much lower than those obtained for rev/gp160, a rev-dependent
gp160-expressing plasmid. This is probably because inhibitory regions
(designated INS), which confer rev dependence upon the gp160
transcript, occur at multiple sites within gp160 including at the
COOH-terminus of gp41 (see Figure 1 for schematic view of gp143 construct
strategies). A vector was prepared for a COOH-terminally truncated
form of tPA-gp160, tPA-gp143, which was designed to increase the
overall expression levels of env by elimination of these inhibitory
sequences. The gp143 vector also eliminates intracellular gp41 regions
containing peptide motifs (such as leu-leu) known to cause diversion of
membrane proteins to the lysosomes rather than the cell surface. Thus,
gp143 may be expected to increase both expression of the env protein
(by decreasing rev-dependence) and the efficiency of transport of
protein to the cell surface compared to full-length gp160, where these
proteins may be better able to elicit anti-gp160 antibodies following
DNA vaccination. tPA-gp143 was further modified by extensive silent
mutagenesis of the rev response element (RRE) sequence (350 bp) to
eliminate additional inhibitory sequences for expression. This construct,
gp143/mutRRE, was prepared in two forms: either eliminating (form
A) or retaining (form B) proteolytic cleavage sites for gp120/41. Both
forms were prepared because of literature reports that vaccination of
mice using uncleavable gp160 expressed in vaccinia elicited much higher
levels of antibodies to gp160 than did cleavable forms.

A quantitative ELISA for gp160/gp120 expression in cell
transfectants was developed to determine the relative expression
capabilities for these vectors. In vitro transfection of 293 cells followed
by quantification of cell-associated vs. secreted/released gp120 yielded
the following results: (1) tPA-gp160 expressed 5-10X less gp120 than
rev/gp160 with similar proportions retained intracellularly vs.
trafficked to the cell surface; (2) tPA-gp143 gave 3-6X greater secretion
of gp120 than rev/gp160 with only low levels of cell-associated gp143,
confirming that the cytoplasmic tail of gp160 causes intracellular
retention of gp160 which can be overcome by partial deletion of this
sequence; and, (3) tPA-gp143/mutRRE A and B gave ~10X greater
expression levels of protein than did parental tPA-gp143 while
elimination of proteolytic processing was confirmed for form A.
Figure 4 presents representative data supporting points (1) - (3).

Thus, our strategy to increase rev-independent expression
has yielded stepwise increases in overall expression as well as
redirecting membrane-anchored gp143 to the cell surface away from
lysosomes. It is important to note that it should be possible to insert
gp120 sequences derived from various viral isolates within a vector
cassette containing these modifications, which reside either at the
NH2-terminus (tPA leader) or COOH-terminus (gp41), where few antigenic
differences exist between different viral strains. In other words, this is
a generic construct which can easily be modified by inserting gp120
derived from various primary viral isolates to obtain clinically relevant
vaccines.

To apply these expression strategies to viruses that are
relevant for vaccine purposes and confirm the generality of our
approaches, we also prepared a tPA-gp120 vector derived from a
primary HIV isolate (containing the North American consensus V3
peptide loop; macrophage-tropic and nonsyncytia-inducing phenotypes).
This vector gave high expression/secretion of gp120 with transfected
293 cells and elicited anti-gp120 antibodies in mice, demonstrating that it
was cloned in a functional form. Primary isolate gp160 genes will also
be used for expression in the same way as for gp160 derived from
laboratory strains.

EXAMPLE 3

Immune Responses to HIV-1 env Polynucleotide Vaccines:

African green (AGM) and Rhesus (RHM) monkeys which
received gp120 DNA vaccines showed low levels of neutralizing
antibodies following 2-3 vaccinations, which could not be increased by
additional vaccination. These results, as well as increasing awareness
within the HIV vaccine field that oligomeric gp160 is probably a more
relevant target antigen for eliciting neutralizing antibodies than gp120
monomers (Moore and Ho, J. Virol. 67, 863 (1993)), have led us to
focus upon obtaining effective expression of gp160-based vectors (see
above). Mice and AGM were also vaccinated with the primary isolate
derived tPA-gp120 vaccine. These animals exhibited anti-V3 peptide
(using homologous sequence) reciprocal endpoint antibody titers
ranging from 500-5000, demonstrating that this vaccine design is
functional for clinically relevant viral isolates.

The gp160-based vaccines, rev-gp160 and tPA-gp160,
failed to consistently elicit antibody responses in mice and nonhuman
primates or yielded low antibody titers. Our initial results with the
tPA-gp143 plasmid yielded geometric mean titers > 10^ in mice and
AGM following two vaccinations. These data indicate that we have
significantly improved the immunogenicity of gp160-like vaccines by
increasing expression levels. This construct, as well as the tPA-
gp143/mutRRE A and B vectors, will continue to be characterized for
antibody responses, especially for virus neutralization.
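
The geometric mean titer used above to summarize antibody responses can be sketched as follows. This is a minimal illustration only: the function name and the titer values are hypothetical placeholders chosen to show the arithmetic, not data from these experiments.

```python
import math


def geometric_mean_titer(titers):
    """Return the geometric mean of reciprocal endpoint titers."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive numbers")
    # Average in log10 space, then transform back to the titer scale.
    mean_log = sum(math.log10(t) for t in titers) / len(titers)
    return 10 ** mean_log


# Hypothetical reciprocal endpoint titers for one vaccination group.
example_titers = [500, 1000, 2000, 4000]
print(f"GMT = {geometric_mean_titer(example_titers):.0f}")
```

Averaging in log space keeps a single high responder from dominating the summary, which is why titers are conventionally reported as geometric rather than arithmetic means.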

Significantly, gp120 DNA vaccination produced potent
helper T cell responses in all lymphatic compartments tested (spleen,
blood, inguinal, mesenteric, and iliac nodes) with TH1-like cytokine
secretion profiles (i.e., γ-interferon and IL-2 production with little or
no IL-4). These cytokines generally promote strong cellular immunity
and have been associated with maintenance of a disease-free state for
HIV-seropositive patients. Lymph nodes have been shown to be
primary sites for HIV replication, harboring large reservoirs of virus
even when virus cannot be readily detected in the blood. A vaccine
which can elicit anti-HIV immune responses at a variety of lymph sites,
such as we have shown with our DNA vaccine, may help prevent
successful colonization of the lymphatics following initial infection.

As stated previously, we consider realization of the
following objectives to be essential to maximize our chances for success
with this program: (1) env-based vectors capable of generating stronger
neutralizing antibody responses in primates; (2) gag and env vectors
which elicit strong T-lymphocyte responses as characterized by CTL
and helper effector functions in primates; (3) use of env and gag genes
from clinically relevant HIV-1 strains in our vaccines and
characterization of the immunologic responses, especially neutralization
of primary isolates, they elicit; (4) demonstration of protection in an
animal challenge model such as chimpanzee/HIV (IIIB) or rhesus/SHIV
using appropriate optimized vaccines; and, (5) determination of the
duration of immune responses appropriate to clinical use. Significant
progress has been made on the first three of these objectives and
experiments are in progress to determine whether our recent
vaccination constructs for gp160 and gag will improve upon these initial
results.

EXAMPLE 4

Vectors For Vaccine Production

A. V1Jneo EXPRESSION VECTOR:

It was necessary to remove the ampr gene used for
antibiotic selection of bacteria harboring V1J because ampicillin may
not be used in large-scale fermenters. The ampr gene from the pUC
backbone of V1J was removed by digestion with SspI and Eam1105I
restriction enzymes. The remaining plasmid was purified by agarose
gel electrophoresis, blunt-ended with T4 DNA polymerase, and then
treated with calf intestinal alkaline phosphatase. The commercially
available kanr gene, derived from transposon 903 and contained within
the pUC4K plasmid, was excised using the PstI restriction enzyme,
purified by agarose gel electrophoresis, and blunt-ended with T4 DNA
polymerase. This fragment was ligated with the V1J backbone and
plasmids with the kanr gene in either orientation were derived which
were designated as V1Jneo #'s 1 and 3. Each of these plasmids was
confirmed by restriction enzyme digestion analysis and DNA sequencing of
the junction regions, and was shown to produce similar quantities of
plasmid as V1J. Expression of heterologous gene products was also
comparable to V1J for these V1Jneo vectors. We arbitrarily selected
V1Jneo#3, referred to as V1Jneo hereafter (SEQ.ID:1), which contains
the kanr gene in the same orientation as the ampr gene in V1J, as the
expression construct.

B. V1Jns Expression Vector:
An SfiI site was added to V1Jneo to facilitate integration
studies. A commercially available 13 base pair SfiI linker (New
England BioLabs) was added at the KpnI site within the BGH sequence
of the vector. V1Jneo was linearized with KpnI, gel purified, blunted
by T4 DNA polymerase, and ligated to the blunt SfiI linker. Clonal
isolates were chosen by restriction mapping and verified by sequencing
through the linker. The new vector was designated V1Jns. Expression
of heterologous genes in V1Jns (with SfiI) was comparable to
expression of the same genes in V1Jneo (with KpnI).

C. V1Jns-tPA:

In order to provide a heterologous leader peptide sequence
to secreted and/or membrane proteins, V1Jn was modified to include the
human tissue-specific plasminogen activator (tPA) leader. Two
synthetic complementary oligomers were annealed and then ligated into
V1Jn which had been BglII digested. The sense and antisense oligomers
were 5'-GATC ACC ATG G AT GCA ATG AAG AGA GGG CTC TGC
TGT GTG CTG CTG CTG TGT GGA GCA GTC TTC GTT TCG CCC
AGC GA-3' (SEQ.ID:2), and 5'-GAT CTC GCT GGG CGA AAC GAA
GAC TGC TCC ACA CAG CAG CAG CAC ACA GCA GAG CCC
TCT CTT CAT TGC ATC CAT GGT-3' (SEQ.ID:3). The Kozak
sequence is underlined in the sense oligomer. These oligomers have
overhanging bases compatible for ligation to BglII-cleaved sequences.
After ligation the upstream BglII site is destroyed while the downstream
BglII is retained for subsequent ligations. Both the junction sites as well
as the entire tPA leader sequence were verified by DNA sequencing.
Additionally, in order to conform with our consensus optimized vector
V1Jns (=V1Jneo with an SfiI site), an SfiI restriction site was placed at
the KpnI site within the BGH terminator region of V1Jn-tPA by
blunting the KpnI site with T4 DNA polymerase followed by ligation
with an SfiI linker (catalogue #1138, New England Biolabs). This
modification was verified by restriction digestion and agarose gel
electrophoresis.
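
The annealing and ligation behaviour described for the two tPA leader oligomers can be checked with a short sketch. The sequences are copied from SEQ.ID:2 and SEQ.ID:3 as printed above; the function names and the junction bookkeeping are illustrative assumptions, not part of the original method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Oligomers as printed in the text (spaces removed).
SENSE = (
    "GATC ACC ATG G AT GCA ATG AAG AGA GGG CTC TGC TGT GTG CTG CTG CTG "
    "TGT GGA GCA GTC TTC GTT TCG CCC AGC GA"
).replace(" ", "")
ANTISENSE = (
    "GAT CTC GCT GGG CGA AAC GAA GAC TGC TCC ACA CAG CAG CAG CAC ACA "
    "GCA GAG CCC TCT CTT CAT TGC ATC CAT GGT"
).replace(" ", "")


def annealed_duplex(sense, antisense, overhang=4):
    """Return (sense 5' overhang, paired core, antisense 5' overhang)."""
    core = sense[overhang:]
    if revcomp(antisense)[:-overhang] != core:
        raise ValueError("oligomers are not complementary over the core")
    return sense[:overhang], core, antisense[:overhang]


left, core, right = annealed_duplex(SENSE, ANTISENSE)

# A BglII-cut vector (A^GATCT) contributes 'A' upstream and 'GATCT' downstream.
upstream = "A" + SENSE[:5]      # top strand across the upstream junction
downstream = SENSE[-1] + "GATCT"  # top strand across the downstream junction

print(left, right, upstream, downstream)
```

Both single-stranded ends read GATC, so either end can pair with a BglII-cut end. The upstream junction no longer matches AGATCT, while the downstream junction regenerates it, consistent with the statement that only the downstream BglII site is retained.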

EXAMPLE 5

I. HIV env Vaccine Constructs:

Vaccines Producing Secreted env-derived Antigen (gp120 and gp140):
Expression of the rev-dependent env gene as gp120 was
conducted as follows: gp120 was PCR-cloned from the MN strain of
HIV with either the native leader peptide sequence (V1Jns-gp120), or as
a fusion with the tissue-plasminogen activator (tPA) leader peptide
replacing the native leader peptide (V1Jns-tPA-gp120). tPA-gp120
expression has been shown to be rev-independent [B.S. Chapman et al.,
Nuc. Acids Res. 19, 3979 (1991); it should be noted that other leader
sequences would provide a similar function in rendering the gp120 gene
rev independent]. This was accomplished by preparing the following
gp120 constructs utilizing the above described vectors.

EXAMPLE 6

gp120 Vaccine Constructs:

A. V1Jns-tPA-HIVMN gp120:

HIVMN gp120 gene (Medimmune) was PCR amplified
using oligomers designed to remove the first 30 amino acids of the
peptide leader sequence and to facilitate cloning into V1Jns-tPA, creating
a chimeric protein consisting of the tPA leader peptide followed by the
remaining gp120 sequence following amino acid residue 30. This
design allows for rev-independent gp120 expression and secretion of
soluble gp120 from cells harboring this plasmid. The sense and
antisense PCR oligomers used were 5'-CCC CGG ATC CTG ATC ACA
GAA AAA TTG TGG GTC ACA GTC-3' (SEQ.ID:4), and 5'-C CCC
AGG AATC CAC CTG HA GCG CTT TTC TCT CIG CAC CAC
TCT TCT C-3' (SEQ.ID:5). The translation stop codon is underlined.
These oligomers contain BamHI restriction enzyme sites at either end of
the translation open reading frame with a BclI site located 3' to the
BamHI of the sense oligomer. The PCR product was sequentially
digested with BclI followed by BamHI and ligated into V1Jns-tPA which
had been BglII digested followed by calf intestinal alkaline phosphatase
treatment. The resulting vector was sequenced to confirm in-frame
fusion between the tPA leader and gp120 coding sequence, and gp120
expression and secretion was verified by immunoblot analysis of
transfected RB cells.

B. V1Jns-tPA-HIVIIIB gp120:

This vector is analogous to I.A. except that the HIV IIIB
strain was used for gp120 sequence. The sense and antisense PCR
oligomers used were: 5'-GGT AGA TGA TGA GA GAA AAA TTG
TOG GTG AGA GTG-3' (SEQ.ID:6), and 5'-GGA GAT TGA TGA
GAT ATG TTA TGT TTT TTG TGT GTG GAG GAG TGT TG-3'
(SEQ.ID:7), respectively. These oligomers provide BclI sites at either
end of the insert as well as an EcoRV just upstream of the BclI site at
the 3'-end. The 5'-terminal BclI site allows ligation into the BglII site
of V1Jns-tPA to create a chimeric tPA-gp120 gene encoding the tPA
leader sequence and gp120 without its native leader sequence. Ligation
products were verified by restriction digestion and DNA sequencing.

EXAMPLE 7

gp140 Vaccine Constructs:

These constructs were prepared by PCR similarly to tPA-
gp120 with the tPA leader in place of the native leader, but designed to
produce secreted antigen by terminating the gene immediately NH2-
terminal of the transmembrane peptide (projected carboxyterminal
amino acid sequence = NH2-TNWLWYIK-COOH) [SEQ.ID:8].
Unlike the gp120-producing constructs, gp140 constructs should
produce oligomeric antigen and retain known gp41-contained antibody
neutralization epitopes such as ELDKWA (SEQ.ID:53) defined by the
2F5 monoclonal antibody.

Constructs were prepared in two forms (A or B) depending
upon whether the gp160 proteolytic cleavage sites at the junction of
gp120 and gp41 were retained (B) or eliminated (A) by appropriate
amino acid substitutions as described by Kieny et al., (Prot. Eng. 2:
219-255 (1988)) (wild type sequence = NH2-

...KAKRRVVQREKR...COOH (SEQ.ID:9) and the mutated sequence =
NH2-...KAQNHVVQNEHQ...COOH (SEQ.ID:10) with mutated amino
acids underlined).

A. V1Jns-tPA-gp140/mutRRE-A/SRV-1 3'-UTR (based on HIV-1 IIIB):

This construct was obtained by PCR using the following
sense and antisense PCR oligomers: 5'-CT GAA AGA CCA OCA ACT
CCT AGO GAAT TTG GGG TTG CTC TGG-3' (SEQ.ID:11), and 5'-
CGC AGG GGA GGT GGT CTA GAT ATC TTA TTA TTT TAT
ATA CCA CAG CCA ATT TGT TAT G-3' (SEQ.ID:12) to obtain an
AvrII/EcoRV segment from vector IVB (containing the optimized RRE-
A segment). The 3'-UTR, prepared as a synthetic gene segment, that is
derived from the Simian Retrovirus-1 (SRV-1, see below) was inserted
into an SrfI restriction enzyme site introduced immediately 3' of the
gp140 open reading frame. This UTR sequence has been described
previously as facilitating rev-independent expression of HIV env and
gag.
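
The relationship between the antisense oligomer and the projected carboxy-terminal sequence of gp140 can be illustrated with a short sketch. It takes only the 3' portion of the antisense oligomer as printed in section A (from the two TTA triplets onward), reverse-complements it, and translates it with the standard genetic code. The function names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons of seq; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# 3' portion of the antisense oligomer, as printed in the text.
PRIMER_3PRIME = "TTA TTA TTT TAT ATA CCA CAG CCA ATT TGT TAT G".replace(" ", "")

coding = revcomp(PRIMER_3PRIME)
# The first base belongs to a codon completed by the template, so skip it.
protein = translate(coding[1:])
print(protein)
```

The translated coding strand ends in TNWLWYIK followed by two stop codons, which agrees with the projected carboxy-terminal sequence given for the gp140 constructs.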

B. V1Jns-tPA-gp140/mutRRE-B/SRV-1 3'-UTR (based on HIV-1 IIIB):

This construct is similar to IIA except that the env
proteolytic cleavage sites have been retained by using construct IVC as
starting material.

C. V1Jns-tPA-gp140/opt30-A (based on HIV-1 IIIB):

This construct was derived from IVB by AvrII and SrfI
restriction enzyme digestion followed by ligation of a synthetic DNA
segment corresponding to gp30 but comprised of optimal codons for
translation (see gp32-opt below). The gp30-opt DNA was obtained
from gp32-opt by PCR amplification using the following sense and
antisense oligomers: 5'-GGT ACA CCT AGG CAT CTG GGG CTG CTC
TGG-3' (SEQ ID:13) and 5'-CCA CAT GAT ATC G CCC GGG C
TTA TTA TTT GAT GTA CCA CAG CCA GTT GOT GAT G-3'
(SEQ ID:14), respectively. This DNA segment was digested with AvrII
and EcoRV restriction enzymes and ligated into V1Jns-tPA-
gp143/opt32-A (IVD) that had been digested with AvrII and SrfI to
remove the corresponding DNA segment. The resulting products were
verified by DNA sequencing of ligation junctions and immunoblot
analysis.

D. V1Jns-tPA-gp140/opt30-B (based on HIV-1 IIIB):

This construct is similar to IIC except that the env 
proteolytic cleavage sites have been retained. 

E. V1Jns-tPA-gp140/opt all-A:

The env gene of this construct is comprised completely of
optimal codons. The constant regions (C1, C5, gp32) are those
described in IVB,D,H, while the segment corresponding to variable
regions 1-5 is inserted using a synthetic DNA
segment comprised of optimal codons for translation (see example
below based on HIV-1 MN V1-V5).

F. V1Jns-tPA-gp140/opt all-B:

This construct is similar to IIE except that the env
proteolytic cleavage sites have been retained.

G. V1Jns-tPA-gp140/opt all-A (non-IIIB strains):

This construct is similar to IIE above except that env amino
acid sequences from strains other than IIIB are used to determine
optimum codon usage throughout the variable (V1-V5) regions.

H. V1Jns-tPA-gp140/opt all-B (non-IIIB strains):

This construct is similar to IIG except that the env
proteolytic cleavage sites have been retained.

EXAMPLE 8

gp160 Vaccine Constructs:

Constructs were prepared in two forms (A or B) depending
upon whether the gp160 proteolytic cleavage sites were removed or
retained, as described above.

A. V1Jns-rev/env:

This vector is a variation of the one described in section D
above except that the entire tat coding region in exon 1 is deleted up to
the beginning of the rev open reading frame. V1Jns-gp160IIIB (see
section A. above) was digested with PstI and KpnI restriction enzymes
to remove the 5'-region of the gp160 gene. PCR amplification was used
to obtain a DNA segment encoding the first REV exon up to the KpnI
site in gp160 from the HXB2 genomic clone. The sense and antisense
PCR oligomers were 5'-GGT ACA CTG GAG TGA GGG TGC T ATG
GGA GGA AGA AGG GGA GAG-3' (SEQ.ID:15) and 5'-GGA GAT
GA GGT AGG GGA TAA TAG AGT GTG AGG-3' (SEQ.ID:16)
respectively. These oligomers provide PstI and KpnI restriction
enzyme sites at the 5'- and 3'-termini of the DNA fragment,
respectively. The resulting DNA was digested with PstI and KpnI,
purified from an agarose electrophoretic gel, and ligated with V1Jns-
gp160(PstI/KpnI). The resulting plasmid was verified by restriction
enzyme digestion.

B. V1Jns-gp160:

HIVIIIB gp160 was cloned by PCR amplification from
plasmid pF412 which contains the 3'-terminal half of the HIVIIIB
genome derived from HIVIIIB clone HXB2. The PCR sense and
antisense oligomers were 5'-GGT AGA TGA TGA ACC ATG AGA
GTG AAG GAG AAA TAT GAG G-3' (SEQ.ID:17), and 5'-GGA GAT
TGA TGA GAT ATG GGG ATG HA TAG GAA AAT GGT TTG G-3'
(SEQ.ID:18), respectively. The Kozak sequence and translation stop
codon are underlined. These oligomers provide BclI restriction enzyme
sites outside of the translation open reading frame at both ends of the
env gene. (BclI-digested sites are compatible for ligation with BglII-
digested sites with subsequent loss of sensitivity to both restriction
enzymes. BclI was chosen for PCR-cloning gp160 because this gene
contains internal BglII as well as BamHI sites). The antisense
oligomer also inserts an EcoRV site just prior to the BclI site as
described above for other PCR-derived genes. The amplified gp160
gene was agarose gel-purified, digested with BclI, and ligated to V1Jns
which had been digested with BglII and treated with calf intestinal
alkaline phosphatase. The cloned gene was about 2.6 kb in size and each
junction of gp160 with V1Jns was confirmed by DNA sequencing.
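
The compatibility of BclI- and BglII-digested ends, and the loss of both sites after ligation, can be sketched from the standard recognition sequences of these enzymes (BglII A^GATCT, BclI T^GATCA, BamHI G^GATCC). The dictionary and function names below are illustrative assumptions.

```python
# Recognition sequence and top-strand cut position for each enzyme.
SITES = {
    "BglII": ("AGATCT", 1),
    "BclI": ("TGATCA", 1),
    "BamHI": ("GGATCC", 1),
}


def overhang(enzyme):
    """5' overhang left by a palindromic enzyme with a staggered cut."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def ligated_junction(left_enzyme, right_enzyme):
    """Top-strand sequence formed by joining a left end to a right end."""
    if overhang(left_enzyme) != overhang(right_enzyme):
        raise ValueError("ends are not compatible")
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


def enzymes_cutting(sequence):
    """Names of the listed enzymes whose site occurs in the sequence."""
    return [name for name, (site, _) in SITES.items() if site in sequence]


for left, right in [("BglII", "BclI"), ("BclI", "BglII")]:
    junction = ligated_junction(left, right)
    print(left, right, junction, enzymes_cutting(junction))
```

All three enzymes leave the same GATC overhang, so a BclI-cut insert can be ligated into a BglII-cut vector. The hybrid junctions (AGATCA and TGATCT) match none of the three recognition sequences, which is why the ligated sites are no longer sensitive to either enzyme.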

C. V1Jns-tPA-gp160 (based on HIV-1 IIIB):

This vector is similar to Example 1(C) above, except that
the full-length gp160, without the native leader sequence, was obtained
by PCR. The sense oligomer was the same as used in I.C. and the
antisense oligomer was 5'-CCA CAT TG A TCA GAT ATC CCC ATC
TTA TAG CAA AAT CCT TTC C-3' (SEQ.ID:19). These oligomers
provide BclI sites at either end of the insert as well as an EcoRV just
upstream of the BclI site at the 3'-end. The 5'-terminal BclI site allows
ligation into the BglII site of V1Jns-tPA to create a chimeric tPA-gp160
gene encoding the tPA leader sequence and gp160 without its native
leader sequence. Ligation products were verified by restriction
digestion and DNA sequencing.

D. V1Jns-tPA-gp160/opt C1/opt41-A (based on HIV-1 IIIB):
This construct was based on IVH, having a complete
optimized codon segment for C5 and gp41, rather than gp32, with an
additional optimized codon segment (see below) replacing C1 at the
amino terminus of gp120 following the tPA leader. The new C1
segment was joined to the remaining gp143 segment via SOE PCR using
the following oligomers for PCR to synthesize the joined C1/143
segment: 5'-CCT GTG TGT GAG TTT AAA C TGC ACT GAT TTG
AAG AAT GAT ACT AAT AC-3' (SEQ ID:20). The resulting gp143
gene contains optimal codon usage except for V1-V5 regions and has a
unique PmeI restriction enzyme site placed at the junction of C1 and V1
for insertion of variable regions from other HIV genes.

E. V1Jns-tPA-gp160/opt C1/opt41-B (based on HIV-1 IIIB):
This construct is similar to IIID except that the env
proteolytic cleavage sites have been retained.

F. V1Jns-tPA-gp160/opt all-A (based on HIV-1 IIIB):

The env gene of this construct is comprised completely of
optimal codons as described above. The constant regions (C1, C5,
gp32) are those described in IIID,E which is used as a cassette
(employed for all completely optimized gp160s) while the variable
regions, V1-V5, are derived from a synthetic DNA segment comprised
of optimal codons.

G. V1Jns-tPA-gp160/opt all-B:

This construct is similar to IIIF except that the env
proteolytic cleavage sites have been retained.

H. V1Jns-tPA-gp160/opt all-A (non-IIIB strains):

This construct is similar to IIIF above except that env
amino acid sequences from strains other than IIIB were used to
determine optimum codon usage throughout the variable (V1-V5)
regions.

I. V1Jns-tPA-gp160/opt all-B (non-IIIB strains):

This construct is similar to IIIH except that the env
proteolytic cleavage sites have been retained.

EXAMPLE 9

gp143 Vaccine Constructs:

These constructs were prepared by PCR similarly to other
tPA-containing constructs described above (tPA-gp120, tPA-gp140, and
tPA-gp160), with the tPA leader in place of the native leader, but
designed to produce COOH-terminated, membrane-bound env
(projected intracellular amino acid sequence = NH2-NRVRQGYSP-
COOH). This construct was designed with the purpose of combining the
increased expression of env accompanying tPA introduction and
minimizing the possibility that a transcript or peptide region
corresponding to the intracellular portion of env might negatively
impact expression or protein stability/transport to the cell surface.
Constructs were prepared in two forms (A or B) depending upon
whether the gp160 proteolytic cleavage sites were removed or retained
as described above. The residual gp41 fragment resulting from
truncation to gp143 is referred to as gp32.

A. V1Jns-tPA-gp143:

This construct was prepared by PCR using plasmid pF412
with the following sense and antisense PCR oligomers: 5'-GGT AGA
TGA TGA GA GAA AAA TTG TGG GTG AGA GTG-3' (SEQ.ID:21),
and 5'-GGA GAT TGA TGA G GGG GGG G TTA GGG TGA ATA
GCG GTG GGT GAG TGT GTT GAG-3' (SEQ.ID:22). The resulting
DNA segment contains BclI restriction sites at either end for cloning
into V1Jns-tPA/BglII-digested with an SrfI site located immediately 3'-
to the env open reading frame. Constructs were verified by DNA
sequencing of ligation junctions and immunoblot analysis of transfected
cells (Figure 8).

B. V1Jns-tPA-gp143/mutRRE-A:

This construct was based on IVA by excising the DNA
segment using the unique MunI restriction enzyme site and the
downstream SrfI site described above. This segment corresponds to a
portion of the gp120 C5 domain and the entirety of gp32. A synthetic

DNA segment corresponding to ~350 bp of the rev response element
(RRE A) of gp160, comprised of optimal codons for translation, was
joined to the remaining gp32 segment by splice overlap extension (SOE)
PCR creating an AvrII restriction enzyme site at the junction of the two
segments (but no changes in amino acid sequence). These PCR reactions
were performed using the following sense and antisense PCR oligomers
for generating the gp32-containing domain: 5'-CT GAA AGA GGA
GGA AGT GGT AGG GAT TTG GGG TTG GTG TCG-3' (SEQ ID:23)
and 5'-CGA GAT TGA TGA G GGG GGG G TTA GGG TGA ATA
GGG GTG GGT GAG TGT GTT GAG-3' [SEQ ID:24] (which was used
as the antisense oligomer for IVA), respectively. The mutated RRE
(mutRRE-A) segment was joined to the wild type sequence of gp32 by
SOE PCR using the following sense oligomer, 5'-GGT AGA GAA TTG
GAG GAG GGA GTT ATA TAA ATA TAA G-3' (SEQ ID:25), and
the antisense oligomer used to make the gp32 segment. The resulting
joined DNA segment was digested with MunI and SrfI restriction
enzymes and ligated into the parent gp143/MunI/SrfI digested plasmid.
The resulting construct was verified by DNA sequencing of ligation and
SOE PCR junctions and immunoblot analysis of transfected cells (Figure
8).

C. V1Jns-tPA-gp143/mutRRE-B:

This construct is similar to IVB except that the env
proteolytic cleavage sites have been retained by using the mutRRE-B
synthetic gene segment in place of mutRRE-A.

D. V1Jns-tPA-gp143/opt32-A:

This construct was derived from IVB by AvrII and SrfI
restriction enzyme digestion followed by ligation of a synthetic DNA
segment corresponding to gp32 but comprised of optimal codons for
translation (see gp32 opt below). The resulting products were verified
by DNA sequencing of ligation junctions and immunoblot analysis.

E. V1Jns-tPA-gp143/opt32-B:

This construct is similar to IVD except that the env
proteolytic cleavage sites have been retained by using IVC as the initial
plasmid.

F. V1Jns-tPA-gp143/SRV-1 3'-UTR:

This construct is similar to IVA except that the 3'-UTR
derived from the Simian Retrovirus-1 (SRV-1, see below) was inserted
into the SrfI restriction enzyme site introduced immediately 3' of the
gp143 open reading frame. This UTR sequence has been described
previously as facilitating rev-independent expression of HIV env and
gag.

G. V1Jns-tPA-gp143/opt C1/opt32A:

This construct was based on IVD, having a complete
optimized codon segment for C5 and gp32 with an additional optimized
codon segment (see below) replacing C1 at the amino terminus of gp120
following the tPA leader. The new C1 segment was joined to the
remaining gp143 segment via SOE PCR using the following oligomers
for PCR to synthesize the joined C1/143 segment: 5'-CCT GTG TGT
GAG TTT AAA C TGC ACT GAT TTG AAG AAT GAT ACT AAT
AC-3' (SEQ ID:26). The resulting gp143 gene contains optimal codon
usage except for the V1-V5 regions and has a unique PmeI restriction
enzyme site placed at the junction of C1 and V1 for insertion of variable
regions from other HIV genes.

H. V1Jns-tPA-gp143/opt C1/opt32B:

This construct is similar to IVH except that the env
proteolytic cleavage sites have been retained.

I. V1Jns-tPA-gp143/opt all-A:

The env gene of this construct is comprised completely of
optimal codons. The constant regions (C1, C5, gp32) are those
described in 4B,D,H, with an additional synthetic DNA segment,
comprised of optimal codons for translation, inserted for the
variable regions V1-V5.

J. V1Jns-tPA-gp143/opt all-B:

This construct is similar to IVJ except that the env
proteolytic cleavage sites have been retained.

K. V1Jns-tPA-gp143/opt all-A (non-IIIB strains):

This construct is similar to IIIG above except that env
amino acid sequences from strains other than IIIB were used to
determine optimum codon usage throughout the variable (V1-V5)
regions.

L. V1Jns-tPA-gp143/opt all-B (non-IIIB strains):

This construct is similar to IIIG above except that env
amino acid sequences from strains other than IIIB were used to
determine optimum codon usage throughout the variable (V1-V5)
regions.

EXAMPLE 10

gp143/glyB Vaccine Constructs:

These constructs were prepared by PCR similarly to other
tPA-containing constructs described above (tPA-gp120, tPA-gp140,
tPA-gp143 and tPA-gp160), with the tPA leader in place of the native
leader, but designed to produce COOH-terminated, membrane-bound
env as with gp143. However, gp143/glyB constructs differ from gp143
in that, of the six amino acids projected to comprise the intracellular
peptide domain, the last 4 are the same as those at the carboxyl terminus
of human glycophorin B (glyB) protein (projected intracellular amino acid
sequence = NH2-NRLIKA-COOH (SEQ.ID:27) with the underlined
residues corresponding to glyB and "R" common to both env and glyB).
This construct was designed with the purpose of gaining additional env
expression and directed targeting to the cell surface by completely
eliminating any transcript or peptide region corresponding to the
intracellular portion of env that might negatively impact expression or
protein stability/transport to the cell surface, replacing this region
with a peptide sequence from an abundantly expressed protein (glyB)
having a short cytoplasmic domain (intracellular amino acid sequence =
NH2-RRLIKA-COOH). Constructs were prepared in two forms (A or
B) depending upon whether the gp160 proteolytic cleavage sites were
removed or retained as described above.

A. V1Jns-tPA-gp143/opt32-A/glyB:

This construct is the same as IVD except that the following
antisense PCR oligomer was used to replace the intracellular peptide
domain of gp143 with that of glycophorin B as described above: 5'-
CCA CAT GAT ATC G CCC GGG C TTA TTA GGC CTT GAT CAG
CCG GTT CAC AAT GGA CAG CAC AGC-3' (SEQ ID:28).

B. V1Jns-tPA-gp143/opt32-B/glyB:

This construct is similar to VA except that the env
proteolytic cleavage sites have been retained.

C. V1Jns-tPA-gp143/opt C1/opt32-A/glyB:

This construct is the same as VA except that the first
constant region (C1) of gp120 is replaced by optimal codons for
translation as with IVH.

D. V1Jns-tPA-gp143/opt C1/opt32-B/glyB:

This construct is similar to VC except that the env
proteolytic cleavage sites have been retained.

E. V1Jns-tPA-gp143/opt all-A/glyB:

The env gene of this construct is comprised completely of
optimal codons as described above.

F. V1Jns-tPA-gp143/opt all-B/glyB:

This construct is similar to VE except that the env
proteolytic cleavage sites have been retained.

G. V1Jns-tPA-gp143/opt all-A/glyB (non-IIIB strains):

This construct is similar to IIIG above except that env
amino acid sequences from strains other than IIIB were used to
determine optimum codon usage throughout the variable (V1-V5)
regions.

H. V1Jns-tPA-gp143/opt all-B/glyB (non-IIIB strains):

This construct is similar to VG except that the env
proteolytic cleavage sites have been retained.

HIV env Vaccine Constructs with Variable Loop Deletions:

These constructs may include all env forms listed above
(gp120, gp140, gp143, gp160, gp143/glyB) but have had variable loops
within the gp120 region deleted during preparation (e.g., V1, V2,
and/or V3). The purpose of these modifications is to eliminate peptide
segments which may occlude exposure of conserved neutralization
epitopes such as the CD4 binding site. For example, the following
oligomer was used in a PCR reaction to create a V1/V2 deletion
resulting in adjoining the C1 and C2 segments: 5'-CTG ACC CCC
CTG TGT GTG GGG GCT GGC AGT TGT AAC ACC TCA GTC
ATT ACA CAG-3' (SEQ ID:29).

EXAMPLE 11

Design of Synthetic Gene Segments for Increased env Gene Expression:

Gene segments were converted to sequences having
identical translated sequences (except where noted) but with alternative
codon usage as defined by R. Lathe in a research article from J. Molec.
Biol. Vol. 183, pp. 1-12 (1985) entitled "Synthetic Oligonucleotide
Probes Deduced from Amino Acid Sequence Data: Theoretical and
Practical Considerations". The methodology described below to
increase rev-independent expression of HIV env gene segments was
based on our hypothesis that the known inability to express this gene
efficiently in mammalian cells is a consequence of the overall transcript
composition. Thus, using alternative codons encoding the same protein
sequence may remove the constraints on env expression in the absence
of rev. Inspection of the codon usage within env revealed that a high
percentage of codons were among those infrequently used by highly
expressed human genes. The specific codon replacement method
employed, using data from Lathe et al., may be described as follows:

1. Identify placement of codons for proper open
reading frame.

2. Compare wild type codon for observed frequency of
use by human genes (refer to Table 3 in Lathe et al.).

3. If codon is not the most commonly employed, replace
it with an optimal codon for high expression based on data in Table 5.

4. Inspect the third nucleotide of the new codon and the
first nucleotide of the adjacent codon immediately 3' of the first. If a
5'-CG-3' pairing has been created by the new codon selection, replace it
with the choice indicated in Table 5.

5. Repeat this procedure until the entire gene segment
has been replaced.

6. Inspect new gene sequence for undesired sequences
generated by these codon replacements (e.g., "ATTTA" sequences,
inadvertent creation of intron splice recognition sites, unwanted
restriction enzyme sites, etc.) and substitute codons that eliminate these
sequences.

7. Assemble synthetic gene segments and test for
improved expression.
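
The replacement procedure of steps 1-6 can be sketched in code. The
following Python is an illustrative sketch only: the PREFERRED table is
a hypothetical stand-in (its entries are assumptions, not the values of
the Lathe tables cited above), and the function names optimize,
translate and find_motifs are introduced here for illustration. Splice
site and restriction site screening (part of step 6) is not attempted.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# HYPOTHETICAL preference table: amino acid -> (first choice, fallback).
# Illustrative assumptions only; not the values of the cited tables.
# Every fallback avoids a 3'-terminal C so it cannot start a CG pair.
PREFERRED = {
    "A": ("GCC", "GCT"), "R": ("AGG", "AGA"), "N": ("AAC", "AAT"),
    "D": ("GAC", "GAT"), "C": ("TGC", "TGT"), "Q": ("CAG", "CAA"),
    "E": ("GAG", "GAA"), "G": ("GGC", "GGA"), "H": ("CAC", "CAT"),
    "I": ("ATC", "ATT"), "L": ("CTG", "CTT"), "K": ("AAG", "AAA"),
    "M": ("ATG", "ATG"), "F": ("TTC", "TTT"), "P": ("CCC", "CCT"),
    "S": ("AGC", "TCT"), "T": ("ACC", "ACA"), "W": ("TGG", "TGG"),
    "Y": ("TAC", "TAT"), "V": ("GTG", "GTT"), "*": ("TGA", "TAA"),
}


def translate(seq):
    """Translate an in-frame DNA sequence to one-letter amino acids."""
    seq = seq.upper()
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def optimize(seq):
    """Steps 1-5: recode every codon, then break CG pairs at codon junctions."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    aas = translate(seq)                      # step 1: fix the reading frame
    new = [PREFERRED[a][0] for a in aas]      # steps 2-3: first-choice codon
    for i in range(len(new) - 1):             # step 4: junction CG check
        if new[i][2] == "C" and new[i + 1][0] == "G":
            new[i] = PREFERRED[aas[i]][1]
    return "".join(new)


def find_motifs(seq, motifs=("ATTTA",)):
    """Step 6 (partial): report start positions of unwanted motifs."""
    return {m: [i for i in range(len(seq) - len(m) + 1) if seq.startswith(m, i)]
            for m in motifs}


wild = "ATGAGAGTGAAGGAGAAATATCAG"
recoded = optimize(wild)
print(recoded, translate(recoded) == translate(wild), find_motifs(recoded))
```

Any position reported by find_motifs would then be recoded by hand, as
step 6 describes, before the segment is assembled and tested.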

These methods were used to create the following synthetic gene
segments for HIV env, creating a gene comprised entirely of optimal
codon usage for expression: (i) gp120-C1 (opt); (ii) V1-V5 (opt); (iii)
RRE-A/B (mut or opt); and (iv) gp30 (opt) with percentages of codon
replacements/nucleotide substitutions of 56/19, 73/26, 78/28, and 61/25
obtained for each segment, respectively. Each of these segments has
been described in detail above with actual sequences listed below.
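
A pair of figures of the kind quoted above (percent codon
replacements/percent nucleotide substitutions) can be computed from an
aligned wild-type segment and its synthetic counterpart. The sketch
below assumes both sequences are in frame and of equal length; the
function name replacement_stats and the short test sequences are
hypothetical and are not taken from the segments listed below.

```python
def replacement_stats(wild, synthetic):
    """Return (% codons changed, % nucleotides changed) for two aligned,
    in-frame sequences of equal length."""
    wild, synthetic = wild.upper(), synthetic.upper()
    if len(wild) != len(synthetic) or len(wild) % 3 or not wild:
        raise ValueError("sequences must be non-empty, equal length, in frame")
    n_codons = len(wild) // 3
    codon_changes = sum(
        wild[i:i + 3] != synthetic[i:i + 3] for i in range(0, len(wild), 3)
    )
    nt_changes = sum(a != b for a, b in zip(wild, synthetic))
    return 100.0 * codon_changes / n_codons, 100.0 * nt_changes / len(wild)


# Hypothetical 4-codon example: two codons differ by one nucleotide each.
codon_pct, nt_pct = replacement_stats("ATGAGAGTGAAA", "ATGAGGGTGAAG")
print(f"{codon_pct:.0f}/{nt_pct:.0f}")
```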

gp120 C1 (opt)

This is a gp120 constant region 1 (C1) gene segment from the mature
N-terminus to the beginning of V1 designed to have optimal codon
usage for expression.

1 TGATCACAGA GAAGCTGTGG GTGACAGTGT ATTATGGCGT CCTAGTCTGG
51 AAGGAGGCCA (XACTACCCT GTTCTGTGCC TCTGATGCCA AGGCCTATGA
101 CACAGAGGTG CACAATGTGT GGGCCACCCA TGCCTGTGTG CCCACAGACC
151 CCAACCXXrAGGAGGTGGTGCTGGTGAATGTGACTGAGAACTTCAACATG
201 TGGAAGAACA ACATGGTGGA GCAGATGCAT GAGGACATCA TCAGCCTGTC
251 GGACXAGAGCCTGAAGCCCTGTGTGAAGCTGACCCCCCTGTGTCTCAGTT
301 TAAAC (SEQ ID:30)

MN V1-V5 (opt)

This is a gene segment corresponding to the derived protein sequence
for HIV MN V1-V5 (1066 bp) having optimal codon usage for
expression.

1 AGrrrTAAACTQC^«SAGftOCTGAQGAACACCAC^ 
51 AGCCAAC^AACTOCAACTaX3*QQQC>VCGATCAAG^ 
101 AGAACTGCTCCTTCAACATCACXVVCXTaDATCAQQGyVC^ 
40 151 QAGTATQCXXJTGCTOTACAAQCTOQACAnGTOT^ 
201 CACCTCCTACAGGCTGyVTCTCCTOCAACACC^^ 
251 GCCCCAAAATCnOCTrrGAGCCCATaXX)ATCX:ACT^
301 QQCTrrOCCATOCTOAAGTOCWGACA^
351 CnnSCAAGAATGTGTCCACWSTGCAGrQC^ 
5 401 TCTCCVmiAGCTGCTOCTCMTO^^ 

451 ATCAGGTCTG/«3AACTTCACAGftCMTQCCAAGAC^^ 
501 GAATGAGTGTGTCCAGATCAACTtaCWX^ 

10 

551 fiGfiGtSAJOCACATTGiaXCn^GG^^ 
eOl ATTGGCAa^ATCAGCaCAGCaCCXVOQCAACAT^^ 
15 651 TGMMXCTGfiGOCAGATTGTGJ^^ 

701 AGACCATTGI Gl ICAACCABTCCnxn^GmSGGGftCCCTOAGATTGTC^^ 
751 CACTCXnrCVVACnXSTOQCaCSQQQAGTrCTrCTAC^^ 

20 

801 GnCAACTCCACX:TGGAATCQCAACAAC>VCCTQQAACAACA(X^^ 
851 CCAACAACAACATC/mnCCAG[TGCyW3Al^^ 
25 901 TOGC^GGAGGTOOQCAAQGCCATSTATQ^ 

951 CAQGnPOCTOCTDCAACATCACAQODCTGCTGCntSACC^ 
1X1 AQGACACAGACACXV^ACX3^ACXXSAAATCT 

30 

1051 ATGAGGGACAATTGG (SEQID31) 

RRE.Mut (A)

This is a DNA segment corresponding to the rev response element
(RRE) of HIV-1 comprised of optimal codon usage for expression. The
"A" form also has removed the known proteolytic cleavage sites at the
gp120/gp41 junction by using the nucleotides indicated in boldface.

40 1 GACAATrQQAQQAQCQAGTTATATAAATATAAQCTGGTCWVAGATTl^^ 
51 CXnGGGGGTGGCCCCAACAAAAG CTCflBAitfJCfla ^ 

101 ACOafiQCXXSTQQGCATTQQGGCCXTOnTCTQQQCTTTCTQt^^ 

45 

151 QQCTCX:ACAATQQGCQCa»TAQCA7GACCCrrcACC^^ 
201 QCTOCTOACTQGCATDGTTOAQCAQCVy^
251 AAQCXX:A(3CAQCA0CTanCCAQCTGACTCTC^^
301 CAQQCX}OQGGTOCTOGCOCT(XWQCXrTATCTGA^
351 AGGC (SEQ ID:32)

RRE.Mut (B)

This is a DNA segment corresponding to the rev response element
(RRE) of HIV-1 comprised of optimal codon usage for expression. The
"B" form retains the known proteolytic cleavage sites at the gp120/gp41
junction.

1 GACAATTGGAGGAGCGAGTTATATAAATATAAQGTQGTCAA^ 
15 51 CCrrGDGGSTGGCOOCM^A 

101 AGafidQCCGTGGGCATTCK»QCXXnX3TTTC 
151 QQCTtXJACMTOQGOGCOQCTAQCVkTOACX^CTCA^^ 
201 GCTOCTGAGTQGCATtXSrrXAQCyyQC^^ 
251 AAQCCCAGCAQCACXrrOCTCCAQCnG^ 
25 301 CAGGCCOQQGTGOQGOOCTCX3AGCX3CTATCT^^ 
351 AGGC (SEQID:33) 

gp32 (opt)

This is a gp32 gene segment from the AvrII site (starting immediately at
the end of the RRE) to the end of gp143 comprised of optimal codons
for expression.

35 1 CCTAQQCATCTQQQGOQCrCTtjQC>\AGC^^ 
51 GCCCTCGWGCCTCCTOGTOCAACAAGAGa^^ 
101 ACATQACCTQGATQGAQTQGGACAGAGAGATrVUW:;AA^^ 

40 

151 ATCCACTCOC7GATTGAGGAGrOCCAGI\ACCAQCAQGAGAAGAATG 
201 GQAGCTQCTGGASCTOGACAAGraOQCXnCCC^^ 
251 TCACCAACTOGCTGTGQTACATCAAAATCT7GATCATGATTGTQQGQQGC
301 CTQGrrGGQQCTQCXaGAI lUI Ul I lUUIGIGCTGTCCATreTCMCCXaQCT
351 GAGACAGGQCTACTCCXXXn-MTAAQCCCGGGCGATATC (SEQ ID:34)

SRV-1 CTE (A)

This is a synthetic gene segment corresponding to a 3'-UTR from the
Simian Retrovirus-1 genome. This DNA is placed in the following
orientation at the 3'-terminus of HIV genes to increase rev-independent
expression.

SrfI EcoRV
S'GOOCGGGCfifllM^TAGACCACCTCCCCTGCGAGCTMGC^^ 

15 QCXSAATCAOaQCSrAAQAGAGTtS^TT^ 

QOOGTCAGWGCTACrrQCXnAATDCAAAGACGG^^ 

mATATATATTTAAAAaQGTO«X:TCTOCQQAQ0CX3^ 



20 



25 



ATGTCTTGGGATATC GCCCCSGGC a- (SEQID.-35) 
EcoRV SrfI

SRV-1 CTE (B)

This synthetic gene segment is identical to SRV-1 CTE (A) shown above
except that a single nucleotide mutation was used (indicated by boldface)
to eliminate an ATTTA sequence. This sequence has been associated
with increased mRNA turnover.

SrfI EcoRV
5'-fiCCC££fiC(a«amTAGACCyVCXnCCCXnaX3^^ 



GaX3TCAGAGCTACTOCX:TAATa}AAAG^ 

15 TATCACTOAAOCTAAG^GGOGP^GCTTOCG^ 

mATATATATTMAMOOGTC^CCTCTOCQG^VGDOGn^ 

ATGTCTTQQfifil^ GCCCQGGC a' (SEQID.-36) 
EcoRV SrfI

EXAMPLE 11

In Vitro gp120 Vaccine Expression:

In vitro expression was tested in transfected human
rhabdomyosarcoma (RD) cells for these constructs. Quantitation of
secreted tPA-gp120 from transfected RD cells showed that the
V1Jns-tPA-gp120 vector produced secreted gp120.

In Vivo gp120 Vaccination:

V1Jns-tPA-gp120MN PNV-induced Class II MHC-restricted
T lymphocyte gp120 specific antigen reactivities. Balb/c mice
which had been vaccinated two times with 200 µg V1Jns-tPA-gp120MN
were sacrificed and their spleens extracted for in vitro determinations of
helper T lymphocyte reactivities to recombinant gp120. T cell
proliferation assays were performed with PBMC (peripheral blood
mononuclear cells) using recombinant gp120IIIB (Repligen, catalogue
#RP1016-20) at 5 µg/ml with 4 x 10^5 cells/ml. Basal levels of 3H-
thymidine uptake by these cells were obtained by culturing the cells in
media alone, while maximum proliferation was induced using ConA
stimulation at 2 µg/ml. ConA-induced reactivities peak at ~3 days and
were harvested at that time point with media control samples, while
antigen-treated samples were harvested at 5 days with an additional
media control. Responses of vaccinated mice were compared with naive,
age-matched syngeneic mice. ConA positive controls gave very high
proliferation for both naive and immunized mice as expected. Very
strong helper T cell memory responses were obtained by gp120
treatment in vaccinated mice while the naive mice did not respond (the
threshold for specific reactivity is a stimulation index (SI) of >3-4; SI
is calculated as the ratio of sample cpm/media cpm). SIs of 65 and 14
were obtained for the vaccinated mice, which compares with anti-gp120
ELISA titers of 5643 and 11,900, respectively, for these mice.
Interestingly, for these two mice the higher responder for antibody gave
significantly lower T cell reactivity than the mouse having the lower
antibody titer. This experiment demonstrates that the secreted gp120
vector efficiently activates helper T cells in vivo as well as generates
strong antibody responses. In addition, each of these immune responses
was determined using antigen which was heterologous compared to that
encoded by the inoculation PNV (IIIB vs. MN).
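
The stimulation index defined above (sample cpm divided by media cpm)
can be written out as a short calculation. This is a minimal sketch:
the function names and the cpm values in the usage lines are
hypothetical, and the default threshold of 3 is the lower end of the
">3-4" range stated in the text.

```python
def stimulation_index(sample_cpm, media_cpm):
    """SI = counts from antigen-treated wells / counts from media-only wells."""
    if media_cpm <= 0:
        raise ValueError("media (background) cpm must be positive")
    return sample_cpm / media_cpm


def is_specific(si, threshold=3.0):
    """Apply the specificity threshold (text gives >3-4; 3 assumed here)."""
    return si > threshold


# Hypothetical counts chosen only to illustrate the arithmetic.
si = stimulation_index(sample_cpm=6500, media_cpm=100)
print(si, is_specific(si))
```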

EXAMPLE 12

gp160 Vaccines

In addition to secreted gp120 constructs, we have prepared
expression constructs for full-length, membrane-bound gp160. The
rationales for a gp160 construct, in addition to gp120, are (1) more
epitopes are available for both CTL stimulation and
neutralizing antibody production, including gp41, against which a potent
HIV neutralizing monoclonal antibody (2F5, see above) is directed; (2) a
more native protein structure may be obtained relative to
virus-produced gp160; and, (3) the success of membrane-bound influenza HA
constructs for immunogenicity [Ulmer et al., Science 259:1745-1749,
1993; Montgomery, D., et al., DNA and Cell Biol., 12:777-783, 1993].
gp160 retains substantial rev dependence even with a heterologous
leader peptide sequence so that further constructs were made to increase
expression in the absence of rev.

EXAMPLE 13

Assay For HIV Cytotoxic T-Lymphocytes:

The methods described in this section illustrate the assay as
used for vaccinated mice. An essentially similar assay can be used with
primates except that autologous B cell lines must be established for use
as target cells for each animal. This can be accomplished for humans
using the Epstein-Barr virus and for rhesus monkey using the herpes B
virus.

Peripheral blood mononuclear cells (PBMC) are derived
from either freshly drawn blood or spleen using Ficoll-Hypaque
centrifugation to separate erythrocytes from white blood cells. For
mice, lymph nodes may be used as well. Effector CTLs may be
prepared from the PBMC either by in vitro culture in IL-2 (20 U/ml)
and concanavalin A (2 µg/ml) for 6-12 days or by using specific antigen
with an equal number of irradiated antigen presenting cells. Specific
antigen can consist of either synthetic peptides (9-15 amino acids
usually) that are known epitopes for CTL recognition for the MHC
haplotype of the animals used, or vaccinia virus constructs engineered to
express appropriate antigen. Target cells may be either syngeneic or
MHC haplotype-matched cell lines which have been treated to present
appropriate antigen as described for in vitro stimulation of the CTLs.
For Balb/c mice the P18 peptide
(ArgIleHisIleGlyProGlyArgAlaPheTyrThrThrLysAsn [SEQ.ID:37], for
HIV MN strain) can be used at 10 µM concentration to restimulate CTL
in vitro using irradiated syngeneic splenocytes and can be used to
sensitize target cells during the cytotoxicity assay at 1-10 µM by
incubation at 37°C for about two hours prior to the assay. For these
H-2d MHC haplotype mice, the murine mastocytoma cell line, P815,
provides good target cells. Antigen-sensitized target cells are loaded
with Na51CrO4, which is released from the interior of the target cells
upon killing by CTL, by incubation of targets for 1-2 hours at 37°C
(0.2 mCi for ~5 x 10^6 cells) followed by several washings of the target
cells. CTL populations are mixed with target cells at varying ratios of
effectors to targets such as 100:1, 50:1, 25:1, etc., pelleted together, and
incubated 4-6 hours at 37°C before harvest of the supernatants, which
are then assayed for release of radioactivity using a gamma counter.
Cytotoxicity is calculated as a percentage of total releasable counts from
the target cells (obtained using 0.2% Triton X-100 treatment) from
which spontaneous release from target cells has been subtracted.
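
One common reading of the cytotoxicity calculation above subtracts
spontaneous release from both the experimental counts and the total
(Triton X-100) counts before taking the ratio. The sketch below follows
that reading, which is an assumption about the intended formula; the
function name and the counts used are hypothetical.

```python
def percent_specific_lysis(experimental_cpm, spontaneous_cpm, total_cpm):
    """Percent specific 51Cr release, corrected for spontaneous release.

    experimental_cpm: supernatant counts from targets incubated with CTL
    spontaneous_cpm:  supernatant counts from targets incubated alone
    total_cpm:        counts released by detergent (total releasable)
    """
    releasable = total_cpm - spontaneous_cpm
    if releasable <= 0:
        raise ValueError("total release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / releasable


# Hypothetical counts at three effector:target ratios.
for ratio, cpm in (("100:1", 1500), ("50:1", 1000), ("25:1", 750)):
    print(ratio, percent_specific_lysis(cpm, spontaneous_cpm=500, total_cpm=2500))
```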

EXAMPLE 14

Assay For HIV Specific Antibodies:

ELISAs were designed to detect antibodies generated against
HIV using either specific recombinant protein or synthetic peptides as
substrate antigens. 96 well microtiter plates were coated at 4°C
overnight with recombinant antigen at 2 µg/ml in PBS (phosphate
buffered saline) solution using 50 µl/well on a rocking platform.
Antigens consisted of either recombinant protein (gp120, rev: Repligen
Corp.; gp160, gp41: American Bio-Technologies, Inc.) or synthetic
peptide (V3 peptide corresponding to virus isolate sequences from IIIB,
etc.: American Bio-Technologies, Inc.; gp41 epitope for monoclonal
antibody 2F5). Plates were rinsed four times using wash buffer
(PBS/0.05% Tween 20) followed by addition of 200 µl/well of blocking
buffer (1% Carnation milk solution in PBS/0.05% Tween-20) for 1 hr
at room temperature with rocking. Pre-sera and immune sera were
diluted in blocking buffer at the desired range of dilutions and 100 µl
added per well. Plates were incubated for 1 hr at room temperature
with rocking and then washed four times with wash buffer. Secondary
antibodies conjugated with horse radish peroxidase (anti-rhesus Ig,
Southern Biotechnology Associates; anti-mouse and anti-rabbit Igs,
Jackson Immuno Research), diluted 1:2000 in blocking buffer, were then
added to each sample at 100 µl/well and incubated 1 hr at room
temperature with rocking. Plates were washed 4 times with wash buffer
and then developed by addition of 100 µl/well of an o-phenylenediamine
(o-PD, Calbiochem) solution at 1 mg/ml in 100 mM citrate buffer at pH
4.5. Plates were read for absorbance at 450 nm both kinetically (first
ten minutes of reaction) and at 10 and 30 minute endpoints (Thermo-
max microplate reader, Molecular Devices).

EXAMPLE 15

Assay For HIV Neutralizing Antibodies:

In vitro neutralization of HIV isolates using sera
derived from vaccinated animals was assayed as follows. Test sera
and pre-immune sera were heat inactivated at 56°C for 60 min before
use. A titrated amount of HIV-1 was added in 1:2 serial dilutions of test
sera and incubated 60 min at room temperature before addition to 10^
MT-4 human lymphoid cells in 96 well microtiter plates. The virus/cell
mixtures were incubated for 7 days at 37°C and assayed for virus-
mediated killing of cells by staining cultures with tetrazolium dye.
Neutralization of virus is observed by prevention of virus-mediated cell
death.

EXAMPLE 16

Isolation Of Genes From Clinical HIV Isolates:

HIV viral genes were cloned from infected PBMCs which
had been activated by ConA treatment. The preferred method for
obtaining the viral genes was by PCR amplification from infected
cellular genome using specific oligomers flanking the desired genes. A
second method for obtaining viral genes was by purification of viral
RNA from the supernatants of infected cells and preparing cDNA from
this material with subsequent PCR. This method was analogous to
that described above for cloning of the murine B7 gene except for the
PCR oligomers used and random hexamers used to make cDNA rather
than specific priming oligomers.

Genomic DNA was purified from infected cell pellets by
lysis in STE solution (10 mM NaCl, 10 mM EDTA, 10 mM Tris-HCl,
pH 8.0) to which Proteinase K and SDS were added to 0.1 mg/ml and
0.5% final concentrations, respectively. This mixture was incubated
overnight at 56°C and extracted with 0.5 volumes of
phenol:chloroform:isoamyl alcohol (25:24:1). The aqueous phase was
then precipitated by addition of sodium acetate to 0.3 M final
concentration and two volumes of cold ethanol. After pelleting the
DNA from solution the DNA was resuspended in 0.1X TE solution (1X
TE = 10 mM Tris-HCl, pH 8.0, 1 mM EDTA). At this point SDS was
added to 0.1% with 2 U of RNAse A with incubation for 30 minutes at
37°C. This solution was extracted with phenol/chloroform/isoamyl
alcohol and then precipitated with ethanol as before. DNA was
suspended in 0.1X TE and quantitated by measuring its ultraviolet
absorbance at 260 nm. Samples were stored at -20°C until used for
PCR.
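
The A260 quantitation step can be expressed as a one-line conversion.
The sketch below relies on the widely used rule of thumb that an
absorbance of 1.0 at 260 nm (1 cm path length) corresponds to roughly
50 µg/ml of double-stranded DNA; that factor, the dilution parameter
and the function name are assumptions supplied for illustration and are
not stated in the text.

```python
def dsdna_conc_ug_per_ml(a260, dilution_factor=1.0, ug_per_ml_per_a260=50.0):
    """Estimate double-stranded DNA concentration from absorbance at 260 nm.

    Assumes a 1 cm path length and ~50 ug/ml per A260 unit (rule of thumb).
    dilution_factor is how many fold the sample was diluted before reading.
    """
    if a260 < 0:
        raise ValueError("absorbance cannot be negative")
    return a260 * ug_per_ml_per_a260 * dilution_factor


# Hypothetical reading: A260 of 0.25 measured on a 20-fold dilution.
print(dsdna_conc_ug_per_ml(0.25, dilution_factor=20))
```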

PCR was performed using the Perkin-Elmer Cetus kit and
procedure using the following sense and antisense oligomers for gp160:
5-0 A AAG AGC AGA AGA CAG TGG CAA TGA -3' (SEQ.ID:38)
and 5'-GGG CTT TGC TAA ATG GGT GGC AAG TGG CCC GGG C
ATG TGG-3' (SEQ.ID:39), respectively. These oligomers add an SrfI
site at the 3'-terminus of the resulting DNA fragment. PCR-derived
segments are cloned into either the V1Jns or V1R vaccination vectors
and V3 regions as well as ligation junction sites confirmed by DNA
sequencing.

EXAMPLE 17

T Cell Proliferation Assays:

PBMCs are obtained and tested for recall responses to
specific antigen as determined by proliferation within the PBMC
population. Proliferation is monitored using 3H-thymidine which is
added to the cell cultures for the last 18-24 hours of incubation before
harvest. Cell harvesters retain isotope-containing DNA on filters if
proliferation has occurred, while quiescent cells do not incorporate the
isotope, which is not retained on the filter in free form. For either
rodent or primate species 4 x 10^5 cells are plated in 96 well microtiter
plates in a total of 200 µl of complete media (RPMI/10% fetal calf
serum). Background proliferation responses are determined using
PBMCs and media alone while nonspecific responses are generated by
using lectins such as phytohaemagglutinin (PHA) or concanavalin A
(ConA) at 1-5 µg/ml concentrations to serve as a positive control.
Specific antigen consists of either known peptide epitopes, purified
protein, or inactivated virus. Antigen concentrations range from 1-10
µM for peptides and 1-10 µg/ml for protein. Lectin-induced
proliferation peaks at 3-5 days of cell culture incubation while antigen-
specific responses peak at 5-7 days. Specific proliferation occurs when
radiation counts are obtained which are at least three-fold over the
media background and is often given as a ratio to background, or
Stimulation Index (SI). HIV gp160 is known to contain several peptides
known to cause T cell proliferation of gp160/gp120 immunized or HIV-
infected individuals. The most commonly used of these are:
T1 (LysGlnIleIleAsnMetTrpGlnGluValGlyLysAlaMetTyrAla
[SEQ.ID:40]); T2 (HisGluAspIleIleSerLeuTrpAspGlnSerLeuLys
[SEQ.ID:41]); and, TH4 (AspArgValIleGluValValGlnGlyAlaTyrArgAla
IleArg [SEQ.ID:42]). These peptides have been demonstrated to
stimulate proliferation of PBMC from antigen-sensitized mice,
nonhuman primates, and humans.

EXAMPLE 18

Vector V1R Preparation:

In an effort to continue to optimize our basic vaccination
vector, we prepared a derivative of V1Jns which was designated as V1R.
The purpose for this vector construction was to obtain a minimum-sized
vaccine vector, i.e., without unnecessary DNA sequences, which still
retained the overall optimized heterologous gene expression
characteristics and high plasmid yields that V1J and V1Jns afford. We
determined from the literature as well as by experiment that (1) regions
within the pUC backbone comprising the E. coli origin of replication
could be removed without affecting plasmid yield from bacteria; (2) the
3'-region of the kanr gene following the kanamycin open reading frame
could be removed if a bacterial terminator was inserted in its stead; and,
(3) ~300 bp from the 3'-half of the BGH terminator could be removed
without affecting its regulatory function (following the original KpnI
restriction enzyme site within the BGH element).

V1R was constructed by using PCR to synthesize three
segments of DNA from V1Jns representing the CMVintA
promoter/BGH terminator, origin of replication, and kanamycin
resistance elements, respectively. Restriction enzymes unique for each
segment were added to each segment end using the PCR oligomers: SspI
and XhoI for CMVintA/BGH; EcoRV and BamHI for the kan r gene;
and, BclI and SalI for the ori r. These enzyme sites were chosen
because they allow directional ligation of each of the PCR-derived DNA
segments with subsequent loss of each site: EcoRV and SspI leave blunt-
ended DNAs which are compatible for ligation while BamHI and BclI
leave complementary overhangs as do SalI and XhoI. After obtaining
these segments by PCR each segment was digested with the appropriate
restriction enzymes indicated above and then ligated together in a single
reaction mixture containing all three DNA segments. The 5'-end of the
ori r was designed to include the T2 rho independent terminator
sequence that is normally found in this region so that it could provide
termination information for the kanamycin resistance gene. The ligated
product was confirmed by restriction enzyme digestion (>8 enzymes) as
well as by DNA sequencing of the ligation junctions. DNA plasmid
yields and heterologous expression using viral genes within V1R appear
similar to V1Jns. The net reduction in vector size achieved was 1346 bp
(V1Jns = 4.86 kb; V1R = 3.52 kb). [SEQ.ID:43 of this specification;
also see Figure 11 and SEQ ID:100 of WO95/24485; PCT International
Application No. PCT/US95/02633].
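
The end-compatibility argument above (pairs of enzymes whose cut ends
ligate to each other but whose sites are lost at the junction) can be
checked with a short sketch. The recognition sequences and cut
positions in the table are the standard published ones for these six
enzymes and are supplied here rather than taken from the text; the
function names are hypothetical.

```python
# enzyme -> (recognition site 5'->3', top-strand cut offset within the site)
ENZYMES = {
    "SspI":  ("AATATT", 3),   # blunt
    "EcoRV": ("GATATC", 3),   # blunt
    "BamHI": ("GGATCC", 1),   # 5' GATC overhang
    "BclI":  ("TGATCA", 1),   # 5' GATC overhang
    "SalI":  ("GTCGAC", 1),   # 5' TCGA overhang
    "XhoI":  ("CTCGAG", 1),   # 5' TCGA overhang
}


def overhang(enzyme):
    """Single-stranded overhang left by a palindromic cutter ('' if blunt)."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(a, b):
    """Ends can be ligated if both are blunt or the overhangs are identical."""
    return overhang(a) == overhang(b)


def junction(left, right):
    """Top-strand sequence formed when an end cut by `left` is ligated
    to an end cut by `right`."""
    site_l, cut_l = ENZYMES[left]
    site_r, cut_r = ENZYMES[right]
    return site_l[:cut_l] + site_r[cut_r:]


def sites_lost(left, right):
    """True if the junction regenerates neither enzyme's recognition site."""
    j = junction(left, right)
    return j != ENZYMES[left][0] and j != ENZYMES[right][0]


for pair in (("SspI", "EcoRV"), ("BamHI", "BclI"), ("SalI", "XhoI")):
    print(pair, compatible(*pair), junction(*pair), sites_lost(*pair))
```

Because each junction is a hybrid of two different sites, the ligated
product cannot be recut by either enzyme of the pair, which is the
"subsequent loss of each site" noted above.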

PCR oligomer sequences used to synthesize V1R
(restriction enzyme sites are underlined and identified in brackets
following sequence):

(1) 5'-GGT AGA AAT ATT GG GTA TTG GGG ATT GGA TAG G-3'
[SspI], (SEQ.ID:44),

(2) 5'-CCA CAT CTCGAG GAA CCG GGT CAA TTC TTC AGC
ACC-3' [XhoI], (SEQ.ID:45)

(for CMVintA/BGH segment)

(3) 5'-GGT ACA GAT ATC GGA AAG CCA CGT TGT GTC TCA
AAA TC-3' [EcoRV], (SEQ.ID:46),

(4) 5'-CCA CAT GGA TCC G TAA TGC TCT GCC AGT GTT ACA
ACC-3' [BamHI], (SEQ.ID:47)

(for kanamycin resistance gene segment)

(5) 5'-GGT ACA TGA TCA CGT AGA AAA GAT CAA AGG ATC
TTC TTG-3' [BclI], (SEQ.ID:48),

(6) 5'-CCA CAT GTC GAC CC GTA AAA AGG CCG CGT TGC
TGG-3' [SalI], (SEQ.ID:49)

(for E. coli origin of replication)

Ligation junctions were sequenced for V1R using the
following oligomers:

5'-GAG CCA ATA TAA ATG TAC-3' (SEQ.ID:50)
[CMVintA/kanr junction]

5'-CAA TAG CAG GCA TGC-3' (SEQ.ID:51) [BGH/ori
junction]

5'-G CAA GCA GCA GAT TAC-3' (SEQ.ID:52) [ori/kanr
junction]

EXAMPLE 19

Heterologous Expression of HIV Late Gene Products

HIV structural genes such as env and gag require
expression of the HIV regulatory gene, rev, in order to efficiently
produce full-length proteins. We have found that rev-dependent
expression of gag yielded low levels of protein and that rev itself may
be toxic to cells. Although we achieved relatively high levels of rev-
dependent expression of gp160 in vitro, this vaccine elicited low levels of
antibodies to gp160 following in vivo immunization with rev/gp160
DNA. This may result from known cytotoxic effects of rev as well as
increased difficulty in obtaining rev function in myotubules containing
hundreds of nuclei (rev protein needs to be in the same nucleus as a rev-
dependent transcript in order for gag or env protein expression to
occur). However, it has been possible to obtain rev-independent
expression using selected modifications of the env gene.

1. rev-independent expression of env:

In general, our vaccines have utilized primarily HIV (IIIB)
env and gag genes for optimization of expression within our generalized
vaccination vector, V1Jns, which is comprised of a CMV
immediate-early (IE) promoter, a BGH-derived polyadenylation and
transcriptional termination sequence, and a pUC backbone. Varying
efficiencies of rev-independent expression, depending upon how large a
gene segment is used (e.g., gp120 vs. gp160), may be achieved for env by
replacing its native secretory leader peptide with that from the
tissue-specific plasminogen activator (tPA) gene and expressing the
resulting chimeric gene behind the CMVIE promoter with the CMV intron A.
tPA-gp120 is an example of a secreted gp120 vector constructed in this
fashion which functions well enough to elicit anti-gp120 immune
responses in vaccinated mice and monkeys.

Because of reports that membrane-anchored proteins may
induce much more substantial (and perhaps more specific for HIV
neutralization) antibody responses compared to secreted proteins, as well
as to gain additional epitopes, we prepared V1Jns-tPA-gp160 and
V1Jns-rev/gp160. The tPA-gp160 vector produced detectable quantities of
gp160 and gp120, without the addition of rev, as shown by immunoblot
analysis of transfected cells, although levels of expression were much
lower than those obtained for rev/gp160, a rev-dependent
gp160-expressing plasmid. This is probably because inhibitory regions,
which confer rev dependence upon the gp160 transcript, occur at multiple
sites within gp160 including at the COOH-terminus of gp41. A vector was
prepared for a COOH-terminally truncated form of tPA-gp160
(tPA-gp143) which was designed to increase the overall expression levels
of env by elimination of these inhibitory sequences. The gp143 vector also
eliminates intracellular gp41 regions containing peptide motifs (such as
Leu-Leu) known to cause diversion of membrane proteins to the
lysosomes rather than the cell surface. Thus, gp143 may be expected to
have increased levels of expression of the env protein (by decreasing
rev-dependence) and greater efficiency of transport of protein to the
cell surface compared to full-length gp160, where these proteins may be
better able to elicit anti-gp160 antibodies following DNA vaccination.

tPA-gp143 was further modified by extensive silent mutagenesis of the
rev response element (RRE) sequence (350 bp) to eliminate additional
inhibitory sequences for expression. This construct, gp143/mutRRE,
was prepared in two forms: either eliminating (form A) or retaining
(form B) proteolytic cleavage sites for gp120/41. Both forms were
prepared because of literature reports that vaccination of mice using
uncleavable gp160 expressed in vaccinia elicited much higher levels of
antibodies to gp160 than did cleavable forms.

A quantitative ELISA for gp160/gp120 expression in cell
transfectants was developed to determine the relative expression
capabilities for these vectors. In vitro transfection of 293 cells followed
by quantification of cell-associated vs. secreted/released gp120 yielded
the following results: (1) tPA-gp160 expressed 5-10X less gp120 than
rev/gp160 with similar proportions retained intracellularly vs. released
from the cell surface; (2) tPA-gp143 gave 3-6X greater secretion of
gp120 than rev/gp160 with only low levels of cell-associated gp143,
confirming that the cytoplasmic tail of gp160 causes intracellular
retention of gp160 which can be overcome by partial deletion of this
sequence; and (3) tPA-gp143/mutRRE A and B gave ~10X greater
expression levels of protein than did parental tPA-gp143, while
elimination of proteolytic processing was confirmed for form A.

Thus, our strategy to increase rev-independent expression
has yielded stepwise increases in overall expression levels as well as
redirecting membrane-anchored gp143 to the cell surface away from
lysosomes. It is important to note that this is a generic construct into
which it should be possible to insert gp120 sequences derived from
various primary viral isolates within a vector cassette containing these
modifications, which reside either at the NH2-terminus (tPA leader) or
COOH-terminus (gp41), where few antigenic differences exist between
different viral strains.

Figures 2-7 present data supporting the use of various
constructs, including but not limited to a gp143-based construct, and
preferably a tPA-gp143 based construct, as a DNA vaccine against HIV
infection. Figure 2 shows that tPA-143 (opt41) elicits an anti-gp120
antibody response in the range of GMT=10\ Figure 3 measures
and compares anti-gp120 antibody titers for several DNA vaccines,
including gp143-based constructs. Figure 4 shows the relative
expression of tPA-gp143 and tPA-143/mutRRE in comparison to the
tPA-gp160 construct. Figure 5 measures generation of anti-gp120
antibodies for both the optA and optB forms of tPA-gp143 constructs.
Figure 6 shows the ability of several DNA vaccines, including
tPA-gp143-optA and tPA-gp143-optB, to promote generation of neutralizing
antibodies against HIV strains subsequent to murine DNA vaccination.
Figure 7 also shows HIV neutralization data for various DNA vaccine
constructions, including tPA-gp143-optA, tPA-gp143-optB,
tPA-gp143-optA-glyB and tPA-gp143-optB-glyB.

2. Expression of gp120 derived from a clinical isolate:

To apply these expression strategies to viruses that are
relevant for vaccine purposes and confirm the generality of our
approaches, we also prepared a tPA-gp120 vector derived from a
primary HIV isolate (containing the North American consensus V3
peptide loop; macrophage-tropic and nonsyncytia-inducing phenotypes).
This vector gave high expression/secretion of gp120 with transfected
293 cells and elicited anti-gp120 antibodies in mice, thus demonstrating
that it was cloned in a functional form. Primary isolate gp160 genes
will also be used for expression in the same way as for gp160 derived
from laboratory strains.

3. Immune Responses to HIV-1 env Polynucleotide Vaccines

Effect of vaccination route on immune responses in mice:
While efforts to improve expression of gp160 are ongoing, we have
utilized the tPA-gp120 DNA construct to assess immune responses and
ways to augment them. Intramuscular (i.m.) and intradermal (i.d.)
vaccination routes were compared for this vector at 100, 10, and 1 µg
doses in mice. Vaccination by either route elicited antibody responses
(GMTs = 10^-10^) in all recipients following 2-3 vaccinations at all
three dosage levels. Each route elicited similar anti-gp120 antibody
titers with clear dose-dependent responses. However, we observed
greater variability of responses for i.d. vaccination, particularly at the
lower doses following the initial inoculation. Moreover, helper T-cell
responses, as determined by antigen-specific in vitro proliferation and
cytokine secretion, were higher following i.m. vaccination than i.d. We
concluded that i.d. vaccination did not offer any advantages compared to
i.m. for this vaccine.
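
Antibody responses are summarized here as geometric mean titers (GMTs). As a minimal sketch of how a GMT is computed from individual reciprocal endpoint titers, the example below averages the logarithms of the titers and exponentiates the result. The titer values are hypothetical and are not data from this study; the function name `geometric_mean_titer` is likewise illustrative.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of positive reciprocal endpoint titers."""
    if not titers:
        raise ValueError("at least one titer is required")
    if any(t <= 0 for t in titers):
        raise ValueError("titers must be positive")
    # Average in log space, then transform back.
    mean_log = sum(math.log(t) for t in titers) / len(titers)
    return math.exp(mean_log)


# Hypothetical titers for three animals in one dose group.
example_titers = [100, 1000, 10000]
gmt = geometric_mean_titer(example_titers)
print(f"GMT = {gmt:.0f}")
```

The geometric mean is preferred over the arithmetic mean for titers because endpoint dilutions are spaced multiplicatively, so a single high responder does not dominate the group summary.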

4. gp120 DNA vaccine-mediated helper T cell immunity in mice:

gp120 DNA vaccination produced potent helper T-cell
responses in all lymphatic compartments tested (spleen, blood, inguinal,
mesenteric, and iliac nodes) with TH1-like cytokine secretion profiles
(i.e., γ-interferon and IL-2 production with little or no IL-4). These
cytokines generally promote strong cellular immunity and have been
associated with maintenance of a disease-free state for HIV-seropositive
patients. Lymph nodes have been shown to be primary sites for HIV
replication, harboring large reservoirs of virus even when virus cannot
be readily detected in the blood. A vaccine which can elicit anti-HIV
immune responses at a variety of lymph sites, such as we have shown
with our DNA vaccine, may help prevent successful colonization of the
lymphatics following initial infection.

5. env DNA vaccine-mediated antibody responses:

African green (AGM) and Rhesus (RHM) monkeys which
received gp120 DNA vaccines showed low levels of neutralizing
antibodies following 2-3 vaccinations, which could not be increased by
additional vaccination. These results, as well as increasing awareness
within the HIV vaccine field that oligomeric gp160 is probably a more
relevant target antigen for eliciting neutralizing antibodies than gp120
monomers, have led us to focus upon obtaining effective expression of
gp160-based vectors (see above). Mice and AGM were also vaccinated
with the primary isolate derived tPA-gp120 vaccine. These animals
exhibited anti-V3 peptide (using homologous sequence) reciprocal
endpoint antibody titers ranging from 500 to 5000, demonstrating that
this vaccine design is functional for clinically relevant viral isolates.

The gp160-based vaccines, rev-gp160 and tPA-gp160,
failed to consistently elicit antibody responses in mice and nonhuman
primates or yielded low antibody titers. Our initial results with the
tPA-gp143 plasmid yielded geometric mean titers (GMT) > 10^3 in mice
and AGM following two vaccinations. These data indicate that we have
significantly improved the immunogenicity of gp160-like vaccines by
increasing expression levels and achieving more efficient intracellular
trafficking of env to the cell surface. This construct, as well as the
tPA-gp143/mutRRE A and B vectors, will continue to be characterized for
antibody responses, especially for virus neutralization.

6. env DNA vaccine-mediated CTL responses in monkeys:

We continued to characterize CTL responses of RHM that
had been vaccinated with gp120 and gp160/IRES/rev DNA. All four
monkeys that received this vaccine showed significant MHC Class
I-restricted CTL activities (20-35% specific killing at an effector/target
ratio of 20) following two vaccinations. Following a fourth vaccination
these activities increased to 50-60% killing under similar test conditions,
indicating that additional vaccination boosted responses significantly.
The CTL activities have persisted for at least seven months subsequent
to the final vaccination at about 50% of their peak levels, indicating that
long-term memory had been established.
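
The text reports CTL activity as percent specific killing but does not describe the assay readout. As a hedged illustration only, and assuming a conventional label-release cytotoxicity assay (for example chromium release), percent specific lysis is commonly calculated from experimental, spontaneous and maximum release counts as sketched below. The counts and the function name `percent_specific_lysis` are hypothetical.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Percent specific lysis from label-release counts.

    experimental: release from targets incubated with effector cells
    spontaneous:  release from targets incubated alone
    maximum:      release from fully lysed targets
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Hypothetical counts per minute at an effector/target ratio of 20.
example_lysis = percent_specific_lysis(
    experimental=1600, spontaneous=400, maximum=2800
)
print(f"{example_lysis:.1f}% specific lysis")
```

Subtracting spontaneous release in both numerator and denominator normalizes for label that leaks from target cells in the absence of effectors.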
EXAMPLE 20

SIV/HIV (SHIV) Chimeras:

A major obstacle for testing the protective efficacy of
candidate HIV-1 vaccines has been the lack of a suitable animal
challenge model for this virus. Although the simian immunodeficiency
virus (SIV), which is closely related to HIV, is infectious and causes
AIDS in rhesus monkeys, the only animal species which can be infected
with HIV-1 viral isolates is the chimpanzee. However, the resulting
viremia from this infection is low-level and transient, and no pathogenic
effects (e.g., lymphopenia, immunodeficiency-related opportunistic
infections, etc.) develop. Recently, hybrid viruses comprised of SIV
and HIV genomes have been developed which are also infectious to
rhesus monkeys and which can cause infection-related AIDS. An
example of this type of virus is SHIV-4 (IIIB) (Li et al., J. of Acquired
Immune Deficiency Syndrome, Vol. 5, 639-646 (1992)). This virus
contains the SIV (MAC239) genome except for the regulatory genes, tat
and rev, and the structural gene, env. Because the principal component
of candidate HIV vaccines is based upon env, this virus allows testing
vaccines developed for human clinical purposes for protective efficacy
against infection in an animal model.

EXAMPLE 21 

Plasmid DNA and Recombinant Protein Combination Vaccines:

Vaccines having both a plasmid DNA HIV env component
and a recombinant HIV env protein component were tested for their
abilities to induce antibody responses in rhesus monkeys. Figure 9 and
Figure 10 show the resulting anti-gp120 ELISA antibody and SHIV-4
(IIIB) virus neutralizing antibody titers, respectively, following
vaccination of rhesus with HIV env gene-containing DNA vaccines and
recombinant protein (formulated in an appropriate adjuvant). These
monkeys developed high titers of env-specific antibodies and
neutralizing antibodies. Control monkeys, vaccinated with "blank"
DNA that did not contain a gene and with ovalbumin, did not develop any
detectable env-specific responses, while monkeys vaccinated only with
the protein component of this vaccine showed low levels of
antigen-specific antibodies detected by ELISA and no neutralizing
antibodies. When these monkeys were challenged with SHIV-4 (IIIB) virus,
all control and protein-only monkeys became infected, while those
receiving both env DNA and protein did not develop a detectable SHIV
viremia. These monkeys are currently being tested periodically for
possible delayed onset of infection.

(A) NAME: HAND, J. MARK

(B) REGISTRATION NUMBER: 36,545 

(C) REFERENCE/DOCKET NUMBER: 19729Y PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 908-594-3905 

(B) TELEFAX: 908-594-4720 



(2) INFORMATION FOR SEQ ID NO : 1 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4864 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: both

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 60 

CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 120 

TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 180 

ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG 240 

CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 300 

TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT AATCAATTAC 360

GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT ACATAACTTA CGGTAAATGG 420 

CCCGCCTGGC TGACCGCCCA ACGACCCCCG CCCATTGACG TCAATAATCA CGTATCTTCC 480 

CATAGTAACG CCAATAGGGA CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC 540

TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 600 

TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC 660 

TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG GTGATGCGGT TTTGGCAGTA 720 

CATCAATGGG CGTGGATAGC GGTTTGACTC ACGGGGATTT CCAAGTCTCC ACCCCATTCA 780 

CGTCAATGGG AGTTTGTTTT GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA 840 

CTCCGCCCCA TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 900 

AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT TTGACCTCCA 960 

TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA CGGTGCATTG GAACGCGGAT 1020 

TCCCCGTGCC AAGAGTGACG TAAGTACCGC CTATAGAGTC TATAGGCCCA CCCCCTTGGC 1080 

TTCTTATGCA TGCTATACTG TTTTTGGCTT GGGGTCTATA CACCCCCGCT TCCTCATGTT 1140 

ATAGGTGATG GTATAGCTTA GCCTATAGGT GTGGGTTATT GACCATTATT GACCACTCCC 1200 

CTATTGGTGA CGATACTTTC CATTACTAAT CCATAACATG GCTCTTTGCC ACAACTCTCT 1260 

TTATTGGCTA TATGCCAATA CACTGTCCTT CAGAGACTGA CACGGACTCT GTATTTTTAC 1320 

AGGATGGGGT CTCATTTATT ATTTACAAAT TCACATATAC AACACCACCG TCCCCAGTGC 1380 

CCGCAGTTTT TATTAAACAT AACGTGGGAT CTCCACGCGA ATCTCGGGTA CGTGTTCCGG 1440 
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ACATGGGCTC TTCTCCGGTA GCGGCGGAGC TTCTACATCC GAGCCCTGCT CCCATGCCTC 1500 

CAGCGACTCA TGGTCGCTCG GCAGCTCCTT GCTCCTAACA GTGGAGGCCA GACTTAGGCA 1560 

CAGCACGATG CCCACCACCA CCAGTGTGCC GCACAAGGCC GTGGCGGTAG GGTATCTGTC 1620 

TGAAAATGAG CTCGGGGAGC GGGCTTGCAC CGCTGACGCA TTTGGAAGAC TTAAGGCAGC 1680 

GGCAGAAGAA GATGCAGGCA GCTGAGTTGT TGTGTTCTGA TAAGAGTCAG AGGTAACTCC 1740

CGTTGCGGTG CTGTTAACGG TGGAGGGCAG TGTAGTCTGA GCAGTACTCG TTGCTCCCGC 1800 

GCGCGCCACC AGACATAATA GCTCACAGAC TAACAGACTG TTCCTTTCCA TGGGTCTTTT 1860

CTGCAGTCAC CGTCCTTAGA TCTGCTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC 1920 

CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA 1980 

ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTCGGG GGTGGGGTGG 2040 

GGCAGCACAG CAAGGGGGAG GATTGGGAAG ACAATAGCAG GCATGCTGGG GATGCGGTGG 2100 

GCTCTATGGG TACCCAGGTG CTGAAGAATT GACCCGGTTC CTCCTGGGCC AGAAAGAAGC 2160 

AGGCACATCC CCTTCTCTGT GACACACCCT GTCCACGCCC CTGGTTCTTA GTTCCAGCCC 2220 

CACTCATAGG ACACTCATAG CTCAGGAGGG CTCCGCCTTC AATCCCACCC GCTAAAGTAC 2280 

TTGGAGCGGT CTCTCCCTCC CTCATCAGCC CACCAAACCA AACCTAGCCT CCAAGAGTCG 2340 

GAAGAAATTA AAGCAAGATA GGCTATTAAG TGCAGAGGGA GAGAAAATCC CTCCAACATG 2400 

TGAGGAAGTA ATGAGAGAAA TCATAGAATT TCTTCCGCTT CCTCGCTCAC TGACTCGCTG 2460 

CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA 2520 

TCCACAGAAT CAGGGGATAA CGCAGGAAAG AACATGTGAG CAAAAGGCCA GCAAAAGGCC 2580 

AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCCATA GGCTCCGCCC CCCTGACGAG 2640 

CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC CGACAGGACT ATAAAGATAC 2700

CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG TTCCGACCCT GCCGCTTACC 2760 

GGATACCTGT CCGCCTTTCT CCCTTCGGGA AGCGTGGCGC TTTCTCAATG CTCACGCTGT 2820 

AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG GCTGTGTGCA CGAACCCCCC 2880 

GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC TTGAGTCCAA CCCGGTAAGA 2940 

CACGACTTAT CGCCACTGGC AGCAGCCACT GGTAACAGGA TTAGCAGAGC GAGGTATGTA 3000

GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG CCTAACTACG GCTACACTAG AAGGACAGTA 3060

TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA AAAGAGTTGG TAGCTCTTGA 3120 

TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG TTTGCAAGCA GCAGATTACG 3180 

CGCAGAAAAA AAGGATCTCA AGAAGATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG 3240

TGGAACGAAA ACTCACGTTA AGGGATTTTG GTCATGAGAT TATCAAAAAG GATCTTCACC 3300

TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA TCAGTAAACT 3360

TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT 3420 

CGTTCATCCA TAGTTGCCTG ACTCCGGGGG GGGGGGGCGC TCAGGTCTGC CTCGTGAAGA 3480 

AGGTGTTGCT GACTCATACC AGGCCTGAAT CGCCCCATCA TCCAGCCAGA AAGTCAGGGA 3540 

GCCACGGTTG ATGAGAGCTT TGTTGTAGGT GGACCAGTTG GTGATTTTGA ACTTTTGCTT 3600

TGCCACGGAA CGGTCTGCGT TGTCGGGAAG ATGCGTGATC TCATCCTTCA ACTCAGCAAA 3660

AGTTCGATTT ATTCAACAAA GCCGCCGTCC CGTCAAGTCA GCGTAATGCT CTGCCAGTGT 3720 

TACAACCAAT TAACCAATTC TGATTAGAAA AACTCATCGA GCATCAAATG AAACTGCAAT 3780 

TTATTCATAT CAGGATTATC AATACCATAT TTTTCAAAAA GCCGTTTCTG TAATGAAGGA 3840

GAAAACTCAC CGAGGCAGTT CCATAGGATG GCAAGATCCT GGTATCGGTC TGCGATTCCG 3900 

ACTCGTCCAA CATCAATACA ACCTATTAAT TTCCCCTCGT CAAAAATAAG GTTATCAAGT 3960 

GAGAAATCAC CATGAGTGAC GACTGAATCC GGTGAGAATG GCAAAAGCTT ATCCATTTCT 4020

TTCCAGACTT GTTCAACAGG CCAGCCATTA CGCTCGTCAT CAAAATCACT CGCATCAACC 4080

AAACCGTTAT TCATTCGTGA TTGCGCCTGA GCGAGACGAA ATACGCGATC GCTGTTAAAA 4140 

GGACAATTAC AAACAGGAAT CGAATGCAAC CGGCGCAGGA ACACTGCCAG CGCATCAACA 4200 

ATATTTTCAC CTGAATCAGG ATATTCTTCT AATACCTGGA ATGCTGTTTT CCCGGGGATC 4260 

GCAGTGGTGA GTAACCATGC ATCATCAGGA GTACGGATAA AATCCTTCAT GGTCGGAAGA 4320 

GGCATAAATT CCGTCAGCCA GTTTAGTCTG ACCATCTCAT CTGTAACATC ATTGGCAACG 4380 

CTACCTTTGC CATGTTTCAG AAACAACTCT GGCGCATCGG GCTTCCCATA CAATCGATAG 4440 

ATTGTCGCAC CTGATTGCCC GACATTATCG CGAGCCCATT TATACCCATA TAAATCAGCA 4500 

TCCATGTTGG AATTTAATCG CGGCCTCGAG CAAGACGTTT CCCGTTGAAT ATGGCTCATA 4560

ACACCCCTTG TATTACTGTT TATGTAAGCA GACAGTTTTA TTGTTCATGA TCATATATTT 4620

TTATCTTGTG CAATGTAACA TCAGAGATTT TGAGACACAA CGTCGCTTTC CCCCCCCCCC 4680 

CATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATCTATT 4740 

TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTCACGTC 4800

TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT 4860

CGTC 4864

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GATCACCATG GATGCAATGA AGAGAGGGCT CTGCTGTGTG CTGCTGCTCT GTGGAGCAGT 60 
CTTCGTTTCG CCCAGCGA 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GATCTCGCTG GGCGAAACGA AGACTGCTCC ACACAGCAGC AGCACACAGC AGAGCCCTCT 
CTTCATTGCA TCCATGGT 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CCCCGGATCC TGATCACAGA AAAATTGTGG GTCACAGTC 



(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CCCCAGGAAT CCACCTGTTA GCGCTTTTCT CTCTGCACCA CTCTTCTC 



(2) INFORMATION FOR SEQ ID NO : 6 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GGTACATGAT CACAGAAAAA TTGTGGGTCA CAGTC 



(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CCACATTGAT CAGATATCTT ATCTTTTTTC TCTCTGCACC ACTCTTC 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
Thr Asn Trp Leu Trp Tyr Ile Lys
1               5



(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Lys Ala Gln Asn His Val Val Gln Asn Glu His Gln
1               5                   10



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

CTGAAAGACC AGCAACTCCT AGGGAATTTG GGGTTGCTCT GG 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 58 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CGCAGGGGAG GTGGTCTAGA TATCTTATTA TTTTATATAC CACAGCCAAT TTGTTATG 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GGTACACCTA GGCATCTGGG GCTGCTCTGG

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CCACATGATA TCGCCCGGGC TTATTATTTG ATGTACCACA GCCAGTTGGT GATG 54 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGTACACTGC AGTCACCGTC CTATGGCAGG AAGAAGCGGA GAC 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCACATCAGG TACCCCATAA TAGACTGTGA CC 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GGTACATGAT CAACCATGAG AGTGAAGGAG AAATATCAGC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CCACATTGAT CAGATATCCC CATCTTATAG CAAAATCCTT TCC 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
CCACATTGAT CAGATATCCC CATCTTATAG CAAAATCCTT TCC 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CCTGTGTGTG AGTTTAAACT GCACTGATTT GAAGAATGAT ACTAATAC 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

GGTACATGAT CACAGAAAAA TTGTGGGTCA CAGTC 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

CCACATTGAT CAGCCCGGGC TTAGGGTGAA TAGCCCTGCC TCACTCTGTT CAC 



(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

CTGAAAGACC AGCAACTCCT AGGGATTTGG GGTTGCTGTG G



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

CCACATTGAT CAGCCCGGGC TTAGGGTGAA TAGCCCTGCC TCACTCTGTT CAC 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

GGTACACAAT TGGAGGAGCG AGTTATATAA ATATAAG 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

CCTGTGTGTG AGTTTAAACT GCACTGATTT GAAGAATGAT ACTAATAC

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Asn Arg Leu Ile Lys Ala
1               5



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

CCACATGATA TCGCCCGGGC TTATTAGGCC TTGATCAGCC GGTTCACAAT GGACAGCACA 60 

GC 62 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

CTGACCCCCC TGTGTGTGGG GGCTGGCAGT TGTAACACCT CAGTCATTAC ACAG 



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 305 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TGATCACAGA GAAGCTGTGG GTGACAGTGT ATTATGGCGT GCCAGTCTGG AAGGAGGCCA 60 

CCACCACCCT GTTCTGTGCC TCTGATGCCA AGGCCTATGA CACAGAGGTG CACAATGTGT 120 

GGGCCACCCA TGCCTGTGTG CCCACAGACC CCAACCCCCA GGAGGTGGTG CTCGTCAATG 180 

TGACTGAGAA CTTCAACATG TGGAAGAACA ACATGGTGGA GCAGATGCAT GAGGACATCA 240 

TCAGCCTGTG GGACCAGAGC CTGAAGCCCT GTGTGAAGCT GACCCCCCTG TGTGTCAGTT 300 

TAAAC 305 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1065 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

AGTTTAAACT GCACAGACCT GAGGAACACC ACCAACACCA ACAACTCCAC AGCCAACAAC 60 

AACTCCAACT CCGAGGGCAC CATCAAGGGG GGGGAGATGA AGAACTCCTC CTTCAACATG 120

ACCACCTCCA TCAGGGACAA GATGCAGAAG GAGTATGCCC TCCTGTACAA GCTGGACATT 180

GTGTCCATTG ACAATGACTC CACCTCCTAC AGGCTGATCT CCTGCAACAC CTCTCTCATC 240

ACCCAGGCCT GCCCCAAAAT CTCCTTTGAG CCCATCCCCA TCCACTACTG TGCCCCTGCT 300 

GGCTTTGCCA TCCTGAAGTG CAATGACAAG AAGTTCTCTC GCAAGGGCTC CTGCAAGAAT 360 

GTGTCCACAG TGCAGTGCAC ACATGGCATC AGGCCTGTGG TCTCCACCCA GCTCCTGCTC 420 

AATGGCTCCC TGGCTGAGGA GGAGGTGGTC ATCAGGTCTG AGAACTTCAC AGACAATGCC 480 

AAGACCATCA TCGTGCACCT GAATGAGTCT GTGCAGATCA ACTCCACCAG GCCCAACTAC 540

AACAAGAGGA AGAGGATCCA CATTGGCCCT GGCAGGGCCT TCTACACCAC CAAGAACATC 600

ATTGGCACCA TCAGGCAGGC CCACTGCAAC ATCTCCAGGG CCAAGTGGAA TGACACCCTG 660

AGGCAGATTG TGTCCAAGCT GAAGGAGCAG TTCAAGAACA AGACCATTGT GTTCAACCAG 720

TCCTCTGGGG GGGACCCTGA GATTGTGATG CACTCCTTCA ACTGTGGGGG GGAGTTCTTC 780

TACTGCAACA CCTCCCCCCT GTTCAACTCC ACCTGGAATG GCAACAACAC CTGGAACAAC 840

ACCACAGGCT CCAACAACAA CATCACCCTC CAGTGCAAGA TCAAGCAGAT CATCAACATG 900

TGGCAGGAGG TGGGCAAGGC CATGTATGCC CCCCCCATTG AGGGCCAGAT CAGGTGCTCC 960

TCCAACATCA CAGGCCTGCT GCTGACCAGG GATGGGGGGA AGGACACAGA CACCAACGAC 1020

ACCGAAATCT TCAGGCCTGG GGGGGGGGAC ATGAGGGACA ATTGG 1065

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 354 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

GACAATTGGA GGAGCGAGTT ATATAAATAT AAGGTGGTGA AGATTGAGCC CCTGGGGGTG 60

GCCCCAACAA AAGCTCAGAA CCACGTGGTG CAGAACGAGC ACCAGGCCGT GGGCATTGGG 120

GCCCTGTTTC TGGGCTTTCT GGGGGCTGCT GGCTCCACAA TGGGCGCCGC TAGCATGACC 180

CTCACCGTGC AAGCTCGCCA GCTGCTGAGT GGCATCGTCC AGCAGCAGAA CAACCTGCTC 240

CGCGCCATCG AAGCCCAGCA GCACCTCCTC CAGCTGACTG TGTGGGGGAT CAAACAGCTT 300

CAGGCCCGGG TGCTGGCCGT CGAGCGCTAT CTGAAAGACC AGCAACTCCT AGGC 354


(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 354 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

GACAATTGGA GGAGCGAGTT ATATAAATAT AAGGTGGTGA AGATTGAGCC CCTGGGGGTG 60 

GCCCCAACAA AAGCTAAGAG AAGAGTGGTG CAGAGAGAGA AGAGAGCCGT GGGCATTGGG 120

GCCCTGTTTC TGGGCTTTCT GGGGGCTGCT GGCTCCACAA TGGGCGCCGC TAGCATGACC 180 

CTCACCGTGC AAGCTCGCCA GCTGCTGAGT GGCATCGTCC AGCAGCAGAA CAACCTGCTC 240 

CGCGCCATCG AAGCCCAGCA GCACCTCCTC CAGCTGACTG TGTCGGGGAT CAAACAGCTT 300

CAGGCCCGGG TGCTGGCCGT CGAGCGCTAT CTGAAAGACC AGCAACTCCT AGGC 354

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CCTAGGCATC TGGGGCTGCT CTGGCAAGCT GATCTGCACC ACAGCTGTGC CCTGGAATGC 60

CTCCTGGTCC AACAAGAGCC TGGAGCAAAT CTGGAACAAC ATGACCTGGA TGGAGTGGGA 120

CAGAGAGATC AACAACTACA CCTCCCTGAT CCACTCCCTG ATTGAGGAGT CCCAGAACCA 180

GCAGGAGAAG AATGAGCAGG AGCTGCTGGA GCTGGACAAG TGGGCCTCCC TGTGGAACTG 240

GTTCAACATC ACCAACTGGC TGTGGTACAT CAAAATCTTC ATCATGATTG TGGGGGGCCT 300

GGTGGGGCTG CGGATTGTCT TTGCTGTGCT GTCCATTGTG AACCGGGTGA GACAGGGCTA 360

CTCCCCCTAA TAAGCCCGGG CGATATC 387



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 269 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

GCCCGGGCGA TATCTAGACC ACCTCCCCTG CGAGCTAAGC TGGACAGCCA ATCACGGGTA 60

AGAGAGTGAC ATTTTTCACT AACCTAAGAC AGGAGGGCCG TCAGAGCTAC TCCCTAATCC 120 

AAAGACGGGT AAAAGTGATA AAAATGTATC ACTCCAACCT AAGACAGGCG CAGCTTCCGA 180 

GGGATTTGTC GTCTGTTTTA TATATATTTA AAAGGGTGAC CTGTCCGGAG CCGTGCTCCC 240 

CGGATGATGT CTTGGGATAT CGCCCGGGC 269 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 269 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

GCCCGGGCGA TATCTAGACC ACCTCCCCTG CGAGCTAAGC TGGACAGCCA ATCACGGGTA 60 

AGAGAGTGAC ATTTTTCACT AACCTAAGAC AGGAGGGCCG TCAGAGCTAC TGCCTAATCC 120 

AAAGACGGGT AAAAGTGATA AAAATGTATC ACTCCAACCT AAGACAGGCG CAGCTTCCGA 180 

GGGATTTCTC GTCTCTTTTA TATATATTAA AAAGGGTGAC CTGTCCGGAG CCGTGCTGCC 240 

CGGATCATCT CTTCGGATAT CGCCCGGGC 269 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

Arg Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Lys Asn
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

GAAAGAGCAG AAGACAGTGG CAATGA 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GGGCTTTGCT AAATGGGTGG CAAGTGGCCC GGGCATGTGG 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys Ala Met Tyr Ala
1               5                   10                  15



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

His Glu Asp Ile Ile Ser Leu Trp Asp Gln Ser Leu Lys
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 

Asp Arg Val Ile Glu Val Val Gln Gly Ala Tyr Arg Ala Ile Arg
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3547 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: both 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

GATATTGGCT ATTGGCCATT GCATACGTTG TATCCATATC ATAATATGTA CATTTATATT 60

GGCTCATGTC CAACATTACC GCCATGTTGA CATTGATTAT TGACTAGTTA TTAATAGTAA 120

TCAATTACGG GGTCATTAGT TCATAGCCCA TATATGGAGT TCCGCGTTAC ATAACTTACG 180

GTAAATGGCC CGCCTGGCTG ACCGCCCAAC GACCCCCGCC CATTGACGTC AATAATGACG 240

TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC GTCAATGGGT GGAGTATTTA 300

CGGTAAACTG CCCACTTGGC AGTACATCAA GTGTATCATA TGCCAAGTAC GCCCCCTATT 360

GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC AGTACATGAC CTTATGGGAC 420

TTTCCTACTT GGCAGTACAT CTACGTATTA GTCATCGCTA TTACCATGGT GATGCGGTTT 480

TGGCAGTACA TCAATGGGCG TGGATAGCGG TTTGACTCAC GGGGATTTCC AAGTCTCCAC 540

CCCATTGACG TCAATGGGAG TTTGTTTTGG CACCAAAATC AACGGGACTT TCCAAAATCT 600 

CGTAACAACT CCGCCCCATT GACGCAAATG GGCGGTAGGC GTGTACGGTG GGAGGTCTAT 660 

ATAAGCAGAG CTCGTTTAGT GAACCGTCAG ATCGCCTGGA GACGCCATCC ACGCTGTTTT 720 

GACCTCCATA GAAGACACCG GGACCGATCC AGCCTCCGCG GCCGGGAACG GTGCATTGGA 780

ACGCGGATTC CCCGTGCCAA GAGTGACGTA AGTACCGCCT ATAGAGTCTA TAGGCCCACC 840

CCCTTGGCTT CTTATGCATG CTATACTGTT TTTGGCTTGG GGTCTATACA CCCCCGCTTC 900 

CTCATGTTAT AGGTGATGGT ATAGCTTAGC CTATAGGTGT GGGTTATTGA CCATTATTGA 960 

CCACTCCCCT ATTGGTGACG ATACTTTCCA TTACTAATCC ATAACATCGC TCTTTGCCAC 1020 

AACTCTCTTT ATTGGCTATA TGCCAATACA CTGTCCTTCA GAGACTGACA CGGACTCTGT 1080

ATTTTTACAG GATGGGGTCT CATTTATTAT TTACAAATTC ACATATACAA CACCACCGTC 1140 

CCCAGTGCCC GCAGTTTTTA TTAAACATAA CGTGGGATCT CCACGCGAAT CTCGGGTACG 1200 

TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC GGCGGAGCTT CTACATCCGA GCCCTGCTCC 1260 

CATGCCTCCA GCGACTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT GGAGGCCAGA 1320 

CTTAGGCACA GCACGATGCC CACCACCACC AGTGTGCCGC ACAAGGCCGT GGCGGTAGGG 1380 

TATGTGTCTG AAAATGAGCT CGGGGAGCGG GCTTGCACCG CTGACGCATT TCGAAGACTT 1440 

AAGGCAGCGG CAGAAGAAGA TGCAGGCAGC TGAGTTGTTG TGTTCTGATA AGAGTCAGAG 1500 

GTAACTCCCG TTGCGGTGCT GTTAACGGTG GAGGGCAGTG TAGTCTCAGC AGTACTCGTT 1560 

GCTGCCGCGC GCGCCACCAG ACATAATAGC TGACAGACTA ACAGACTGTT CCTTTCCATG 1620 

GGTCTTTTCT GCAGTCACCG TCCTTAGATC TGCTGTGCCT TCTAGTTCCC AGCCATCTGT 1680 

TGTTTGCCCC TCCCCCGTGC CTTCCTTGAC CCTGGAAGGT GCCACTCCCA CTGTCCTTTC 1740 

CTAATAAAAT GAGGAAATTG CATCGCATTG TCTGAGTAGG TGTCATTCTA TTCTGGGGGG 1800 

TGGGGTGGGG CAGCACAGCA AGGGGGAGGA TTGGGAAGAC AATAGCAGGC ATGCTGGGGA 1860 

TGCGGTGGGC TCTATGGGTA CGGCCGCAGC GGCCGTACCC AGGTCCTGAA GAATTGACCC 1920

GGTTCCTCGA CCCGTAAAAA GGCCGCGTTG CTGGCGTTTT TCCATAGGCT CCGCCCCCCT 1980

GACGAGCATC ACAAAAATCG ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA 2040 

AGATACCAGG CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG 2100 

CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC TCAATGCTCA 2160 

CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA AGCTGGGCTG TGTGCACGAA 2220 

CCCCCCGTTC AGCCCGACCG CTGCGCCTTA TCCGGTAACT ATCGTCTTGA GTCCAACCCG 2280 

GTAAGACACG ACTTATCGCC ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG 2340 

TATGTAGGCG GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGG 2400 

ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG AGTTGGTAGC 2460 

TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT TTTTTGTTTG CAAGCAGCAG 2520 

ATTACGCGCA GAAAAAAAGG ATCTCAAGAA GATCCTTTGA TCTTTTCTAC GTGATCCCGT 2580 

AATGCTCTGC CAGTGTTACA ACCAATTAAC CAATTCTGAT TAGAAAAACT CATCGAGCAT 2640 

CAAATGAAAC TGCAATTTAT TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG 2700 

TTTCTGTAAT GAAGGAGAAA ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA 2760 

TCGGTCTGCG ATTCCGACTC GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA 2820 

AATAAGGTTA TCAAGTGAGA AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA 2880 

AAGCTTATGC ATTTCTTTCC AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA 2940 

ATCACTCGCA TCAACCAAAC CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC 3000 

GCGATCGCTG TTAAAAGGAC AATTACAAAC AGGAATCGAA TGCAACCGGC GCAGGAACAC 3060

TGCCAGCGCA TCAACAATAT TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC 3120 

TGTTTTCCCG GGGATCGCAG TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG 3180 

CTTGATGGTC GGAAGAGGCA TAAATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT 3240 

AACATCATTG GCAACGCTAC CTTTGCCATG TTTCAGAAAC AACTCTGGCG CATCGGGCTT 3300 

CCCATACAAT CGATAGATTG TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA 3360

CCCATATAAA TCAGCATCCA TGTTGGAATT TAATCGCGGC CTCGAGCAAG ACGTTTCCCG 3420

TTGAATATGG CTCATAACAC CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT 3480

TCATGATGAT ATATTTTTAT CTTGTGCAAT GTAACATCAG AGATTTTGAG ACACAACGTG 3540

GCTTTCC 3547

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GGTACAAATA TTGGCTATTG GCCATTGCAT ACG 33 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CCACATCTCG AGGAACCGGG TCAATCCTCC AGCACC 36

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

GGTACAGATA TCGGAAAGCC ACGTTGTGTC TCAAAATC

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

CCACATGGAT CCGTAATGCT CTGCCAGTGT TACAACC 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

GGTACATGAT CACGTAGAAA AGATCAAAGG ATCTTCTTG 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

CCACATGTCG ACCCGTAAAA AGGCCGCGTT GCTGG

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GAGCCAATAT AAATGTAC 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CAATAGCAGG CATGC 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

GCAAGCAGCA GATTAC

(2) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Glu Leu Asp Lys Trp Ala
1               5

WHAT IS CLAIMED IS:

1. A synthetic polynucleotide comprising a DNA
sequence encoding a peptide or protein, the DNA sequence comprising
codons optimized for expression in a nonhomologous host.

2. The synthetic polynucleotide of Claim 1 wherein the 
protein is an HIV protein. 

3. The synthetic polynucleotide of Claim 1 wherein the
DNA sequence encodes HIV env protein or a fragment thereof, the
DNA sequence comprising codons optimized for expression in a
mammalian host.

4. The polynucleotide of Claim 3 which is selected
from:

V1Jns-tPA-HIVMN gp120;
V1Jns-tPA-HIVIIIB gp120;
V1Jns-tPA-gp140/mutRRE-A/SRV-1 3'-UTR;
V1Jns-tPA-gp140/mutRRE-B/SRV-1 3'-UTR;
V1Jns-tPA-gp140/opt30-A;
V1Jns-tPA-gp140/opt30-B;
V1Jns-tPA-gp140/opt all-A;
V1Jns-tPA-gp140/opt all-B;
V1Jns-tPA-gp140/opt all-A;
V1Jns-tPA-gp140/opt all-B;
V1Jns-rev/env;
V1Jns-gp160;
V1Jns-tPA-gp160;
V1Jns-tPA-gp160/opt C1/opt41-A;
V1Jns-tPA-gp160/opt C1/opt41-B;
V1Jns-tPA-gp160/opt all-A;
V1Jns-tPA-gp160/opt all-B;
V1Jns-tPA-gp160/opt all-A;
V1Jns-tPA-gp160/opt all-B;
V1Jns-tPA-gp143;
V1Jns-tPA-gp143/mutRRE-A;
V1Jns-tPA-gp143/mutRRE-B;
V1Jns-tPA-gp143/opt32-A;
V1Jns-tPA-gp143/opt32-B;
V1Jns-tPA-gp143/SRV-1 3'-UTR;
V1Jns-tPA-gp143/opt C1/opt32A;
V1Jns-tPA-gp143/opt C1/opt32B;
V1Jns-tPA-gp143/opt all-A;
V1Jns-tPA-gp143/opt all-B;
V1Jns-tPA-gp143/opt all-A;
V1Jns-tPA-gp143/opt all-B;
V1Jns-tPA-gp143/opt32-A/glyB;
V1Jns-tPA-gp143/opt32-B/glyB;
V1Jns-tPA-gp143/optC1/opt32-A/glyB;
V1Jns-tPA-gp143/optC1/opt32-B/glyB;
V1Jns-tPA-gp143/opt all-A/glyB;
V1Jns-tPA-gp143/opt all-B/glyB;
V1Jns-tPA-gp143/opt all-A/glyB;
V1Jns-tPA-gp143/opt all-B/glyB;

and combinations thereof.

5. The polynucleotide of Claim 2 which induces anti-HIV
neutralizing antibody, HIV specific T-cell immune responses, or
protective immune responses upon introduction into vertebrate tissue,
including human tissue in vivo, wherein said polynucleotide comprises a
gene encoding an HIV gag, HIV protease and combinations thereof.

6. A method for inducing immune responses in a
vertebrate against HIV epitopes which comprises introducing between 1
ng and 100 mg of the polynucleotide of Claim 2 into the tissue of the
vertebrate.

7. A method for using a rev independent HIV gene to
induce immune responses in vivo which comprises:

a) synthesizing the rev independent HIV gene;

b) linking the synthesized gene to regulatory
sequences such that the gene is expressible by virtue of being operatively
linked to control sequences which, when introduced into a living tissue,
direct the transcription initiation and subsequent translation of the gene;

8. A method for inducing immune responses against
infection or disease caused by virulent strains of HIV which comprises
introducing into the tissue of a vertebrate the polynucleotide of Claim 2.

9. A vaccine for inducing immune responses against
HIV infection which comprises the polynucleotide of Claim 2 and a
pharmaceutically acceptable carrier.

10. A method for inducing anti-HIV immune responses
in a primate which comprises introducing the polynucleotide of Claim 2
into the tissue of the primate and concurrently administering interleukin
12, GM-CSF, or combinations thereof parenterally.

11. A method of inducing an antigen presenting cell to
stimulate cytotoxic and helper T-cell proliferation and effector functions
including lymphokine secretion specific to HIV antigens which
comprises exposing cells of a vertebrate in vivo to the polynucleotide of
Claim 2.

12. A method of increasing rev independent in vivo
expression of DNA encoding HIV env or a fragment thereof,
comprising:

(a) identifying placement of codons for proper open
reading frame;

(b) comparing wild type codons for observed frequency
of use by human genes;

(c) replacing wild-type codons with codons optimized
for high expression of human genes; and

(d) testing for improved expression.
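Steps (a) through (c) of Claim 12 can be illustrated with a minimal sketch. It assumes a small, hypothetical preference table (`PREFERRED`) and a deliberately partial codon table (`CODON_TO_AA`); neither is taken from this specification, and a real application would substitute a full genetic code and measured human codon-usage frequencies. Step (d), testing for improved expression, is experimental and is not modeled.

```python
# Hypothetical, partial preference table (assumption for illustration only):
# amino acid -> codon assumed to be favoured in highly expressed human genes.
PREFERRED = {"L": "CTG", "I": "ATC", "K": "AAG", "E": "GAG"}

# Partial genetic code covering only the codons used in this sketch.
CODON_TO_AA = {
    "TTA": "L", "CTG": "L",
    "ATA": "I", "ATC": "I",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
}


def split_codons(seq, frame=0):
    """Step (a): read the sequence as codons in the chosen frame.

    Trailing bases that do not fill a complete codon are dropped.
    """
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def non_preferred(seq, frame=0):
    """Step (b): list (index, codon) pairs that differ from the preferred synonym."""
    hits = []
    for i, codon in enumerate(split_codons(seq, frame)):
        aa = CODON_TO_AA.get(codon)
        if aa in PREFERRED and PREFERRED[aa] != codon:
            hits.append((i, codon))
    return hits


def optimize(seq, frame=0):
    """Step (c): replace each known codon with its preferred synonym.

    Codons absent from the partial tables are passed through unchanged.
    """
    out = []
    for codon in split_codons(seq, frame):
        aa = CODON_TO_AA.get(codon)
        out.append(PREFERRED.get(aa, codon))
    return "".join(out)
```

Because every replacement is a synonym for the same amino acid, the encoded peptide is unchanged; only the nucleotide sequence differs.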

13. A vaccine for inducing immune responses against
HIV infection which comprises the polynucleotide of Claim 2 wherein
the polynucleotide is delivered by a canarypox, vaccinia virus,
adenovirus, adeno-associated virus, retrovirus, Listeria, Shigella,
specific ligand, BCG, or salmonella.

14. A method of inducing an immune response to HIV
which comprises administration of the polynucleotide of Claim 2 and
administration of an attenuated HIV, a killed HIV, an HIV protein, a
fragment of an HIV protein, or combinations thereof, wherein the
administration of the polynucleotide is prior to or simultaneous with or
subsequent to the administration of the attenuated HIV, the killed HIV,
the HIV protein, the fragment of the HIV protein or the combinations
thereof.

15. A method of inducing an immune response to HIV 
which comprises administration of the polynucleotide of Claim 2 with 
an adjuvant.

16. A method of treating HIV infection which comprises
administration of the polynucleotide of Claim 2 to a patient and
administration of an anti-HIV compound to the patient, wherein the
administration of the polynucleotide is prior to or simultaneous with or
subsequent to the administration of the anti-HIV compound.

17. A method of increasing expression of a gene in a
nonhomologous host, comprising:

a) comparing codons of a wild type gene to
codons preferred by the nonhomologous host;

b) replacing codons of the wild type gene with
new codons, the new codons having a DNA sequence preferred by the
nonhomologous host;

c) inspecting third nucleotides of the new codons
and first nucleotides of adjacent new codon immediately 3'- of the first,
and if a 5'-CG-3' pairing has been created by the new codon selection,
replacing it;

d) eliminating undesired sequences to yield a
synthetic optimized gene; and

e) inserting the synthetic gene into the
nonhomologous host.
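The junction inspection of step c) of Claim 17 can be sketched as follows. The sketch makes two assumptions that the claim does not state: that a 5'-CG-3' junction is resolved by swapping the upstream codon, and that the caller supplies a hypothetical `synonyms` mapping from a codon to alternative codons for the same amino acid. Steps d) and e) are not modeled.

```python
def codons(seq):
    """Split an in-frame sequence into codons, ignoring any incomplete tail."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def cg_junctions(seq):
    """Return each index i where codon i ends in C and codon i+1 begins with G."""
    c = codons(seq)
    return [i for i in range(len(c) - 1)
            if c[i][2] == "C" and c[i + 1][0] == "G"]


def fix_junctions(seq, synonyms):
    """Replace the upstream codon at each CG junction (assumed strategy).

    `synonyms` is a caller-supplied, hypothetical mapping such as
    {"ATC": ["ATT"]}; the first alternative that does not end in C is used.
    Junctions with no usable alternative are left unchanged.
    """
    c = codons(seq)
    tail = seq.upper()[len(c) * 3:]  # keep any incomplete trailing bases
    for i in range(len(c) - 1):
        if c[i][2] == "C" and c[i + 1][0] == "G":
            for alt in synonyms.get(c[i], []):
                if alt[2] != "C":
                    c[i] = alt
                    break
    return "".join(c) + tail
```

For example, in `ATC GAG AAG` the first codon ends in C and the second begins with G; supplying `{"ATC": ["ATT"]}` (both codons encode isoleucine) removes the junction without changing the encoded peptide.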

18. A method of expressing a peptide in a host
comprising administration of the synthetic polynucleotide of Claim 1 to
the host.

19. A method of increasing production of a recombinant
protein by a host, comprising:

a) transforming a host cell with the synthetic
polynucleotide of Claim 1 to produce a transformed host; and

b) cultivating the transformed host under
conditions that permit expression of the synthetic polynucleotide and
production of the recombinant protein.

[Drawings omitted; surviving labels: "RECIPROCAL DILUTION", "Cell Pellets"]